editor’s letter


Andrew A. Chien 


Here Comes Everybody... 
to Communications 


I am pleased to announce a new
Communications of the ACM initiative with 
the ambitious goal of expanding 
the Communications community globally. 


I hope it means “Here Comes Everybody
to Communications.”ᵃ Why bring
everybody to Communications? To in- 
clude important voices and perspec- 
tives in the conversation about the 
present and future of computing. 
With the proliferation of computing 
into every industry, every product, 
and every aspect of society, not only 
has computing spread throughout 
the society and economy of every na- 
tion, but the computing profession 
has spread into communities around 
the globe. 

One natural consequence is that
invention and innovation in computing,
once concentrated in a few regions,
are now a global enterprise. And,
while the technical foundations of
computing may be universal,ᵇ along
with the technical challenges of functionality,
scale, reliability, and perhaps
usability, the design of many of a
system’s most important aspects
increasingly reflects distinctive regional,
national, and community culture: how
those aspects relate to society,
government, the structure of commerce,
and individual enlightenment and
perspective, as well as fundamental
choices about security, privacy, free
speech, and control.

a This title borrows from Shirky’s 2008 book
that described the growing power of groups
of individuals to organize large-scale activities,
using Internet tools, and without traditional
corporate organizations. In fact, this is
what the ACM has been doing successfully for
over 50 years.

b More on this later, as growing excitement about
neuromorphic and quantum computing suggests
we may soon see a proliferation of computing
bases. This leads to a question: are we even
engaged with the full breadth of computing?

Communications, the flagship publication
of the world’s leading computing
professional society, should be an in- 
clusive forum, spanning this commu- 
nity. It should be a universal forum, an 
inclusive, global community, with ac- 
tive participation from everyone. 

To that end, I am pleased to announce
a Communications global initiative.
Its goal is to give deeper insight
and focused coverage, and to elevate
distinctive and compelling highlights of
computing drawn from regions around the
world. This initiative will add a 20- to
30-page special section to a few issues
of Communications each year. Each
special section will be a collection 
of short pieces, focused on a region 
and chartered to represent the best of 
computing leadership and distinctive 
development. We will bring a sharp 
focus on: 

> Leading technical and research ad- 
vances and activities; 

> Leading and emerging industry 
and research players; 

> Innovation and shape of comput- 
ing in the region; and, 

> Unique challenges and opportunities

... and by doing so enrich the entire
computing community’s perspective!

Communications’ global initiative
will visit regions around the world in
turn, shifting its spotlight to match the
pace and impact of interesting developments
in computing. We hope to revisit
regions about once every two years.

I am pleased to report that we 
have already begun. The first special 
section will focus on China, where 
we have convened an extraordinary 
team of industry and academic leaders
committed to attending the kickoff
meeting (set for March 8 at the
UChicago Beijing Center), and we are 
actively planning successors in other 
parts of the world. 

How we do this is instrumental. 
The special sections will be led by a 
regional team who will nominate, 
select, and drive authorship of the 
section’s content. By design, this 
will encourage active participation 
of a growing global community in 
Communications. To drive creation of 
the regional teams and the entire 
series of special sections, we are 
adding a new section to the Edito- 
rial Board of Communications. Ser- 
endipitously, this creates new op- 
portunities for you to volunteer and 
contribute to the magazine. 

Expect to hear more about this 
late in 2018 when we print the first 
special section! 


Andrew A. Chien, EDITOR-IN-CHIEF 


Andrew A. Chien is the William Eckhardt Distinguished 
Service Professor in the Department of Computer Science 
at the University of Chicago, Director of the CERES Center 
for Unstoppable Computing, and a Senior Scientist at 
Argonne National Laboratory. 


Copyright held by author.

cerf’s up


Vinton G. Cerf 


Unintended Consequences 


WHEN THE INTERNET was
being developed, scien- 
tists and engineers in 
academic and research 
settings drove the pro- 
cess. In their world, information was a 
medium of exchange. Rather than buy- 
ing information from each other, they 
exchanged it. Patents were not the first 
choice for making progress; rather, 
open sharing of designs and protocols 
was preferred. Of course, there were
instances where hardware and even 
software received patent and licens- 
ing treatment, but the overwhelm- 
ing trend was to keep protocols and 
standards open and free of licensing 
constraints. The information-sharing 
ethic contributed to the belief that 
driving down barriers to information 
and resource sharing was an impor- 
tant objective. Indeed, the Internet as 
we know it today has driven the barrier 
to the generation and sharing of infor- 
mation to nearly zero. Smartphones, 
laptops, tablets, Web cams, sensors, 
and other devices share text, imagery, 
video, and other data with a tap of a fin- 
ger or through autonomous operation. 
Blogs, tweets, social media, and Web 
page updates, email and a host of other 
communication mechanisms course 
through the global Internet in torrents 
(no pun intended). Much, if not most, 
of the information found on the Inter- 
net seems to me to be beneficial; a har- 
vest of human knowledge. But there 
are other consequences of the reduced 
threshold for access to the Internet. 
The volume of information is mind- 
boggling. I recently read one estimate 
that 1.7 trillion images were taken (and 
many shared) in the past year. The Twit- 
tersphere is alive with vast numbers 
of brief tweets. The social media have 
captured audiences and contributors 
measured in the billions. Incentives
to generate and share content
abound: some monetary, some for the
sake of influence, some purely narcis- 
sistic, some to share beneficial knowl- 
edge, to list just a few. A serious prob- 
lem is that the information comes in 
all qualities, from incalculably valuable 
to completely worthless and in some 
cases seriously damaging. Even setting
aside malware, DDoS attacks, hacking,
and the like, we still have misinformation,
disinformation, “fake news,”
“post-truth alternate facts,” fraudulent
propositions, and a raft of other execrable
material often introduced to cause
deliberate harm to victims around the
world. The vast choice of information 
available to readers and viewers leads 
to bubble/echo chamber effects that 
reinforce partisan views, prejudices, 
and other societal ills. 

There are few international norms 
concerning content. Perhaps child 
pornography qualifies as one type of 
content widely agreed to be unaccept- 
able and which should be filtered and 
removed from the Internet. There are 
national norms that vary from country 
to country regarding legitimate and 
illegitimate/illegal content. The result 
is a cacophony of fragmentation and 
misinformation that pollutes the vast 
majority of useful or at least innocuous 
content to be found on the Internet. The 
question before us is what to do about 
the bad stuff. It is irresponsible to
ignore it. It is impossible to filter in real
time. YouTube alone gets 400 hours of 
video uploaded per minute (that is 16.7 
days of a 24-hour television channel). 
The platforms that support content 
are challenged to cope with the scale 
of the problem. Unlike other media 
that have time and space limitations 
(page counts for newspapers and 
magazines; minutes for television 
and radio channels) making it more 
feasible to exercise editorial oversight, 
the Internet is limitless in time and 
space, for all practical purposes. 
Moreover, automated algorithms are 
subject to error or can be misled by the 
action of botnets, for example, that pre- 
tend to be human users “voting” in favor 
of deliberate or accidental misinforma- 
tion. Purely manual review of all the 
incoming content is infeasible. The con- 
sumers of this information might be 
able to use critical thinking to reject 
invalid content but that takes work and 
some people are often unwilling or 
unable to do that work. If we are to cope 
with this new environment, we are going 
to need new tools, better ways to validate 
sources of information and factual data, 
broader agreement on transnational
norms, all the while striving to preserve
freedom of speech and freedom to
hear, enshrined in the Universal Declaration
of Human Rights.ᵃ I hope our
computer science community will find 
or invent ways to engage, using power- 
ful computing, artificial intelligence, 
machine learning, and other tools to 
enable better quality assessment of 
the ocean of content contained in our 
growing online universe. 
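
The scale argument above rests on one piece of arithmetic: 400 hours of video uploaded per minute. The short Python sketch below reproduces the "16.7 days" figure and extends it to a full day. The constant names are illustrative, and the daily total assumes the quoted upload rate holds steady, which the column does not claim.

```python
# Upload rate quoted in the column: 400 hours of video per minute.
HOURS_UPLOADED_PER_MINUTE = 400
HOURS_PER_DAY = 24
MINUTES_PER_DAY = 60 * 24

# Days of a round-the-clock television channel that arrive every minute.
channel_days_per_minute = HOURS_UPLOADED_PER_MINUTE / HOURS_PER_DAY

# Hours uploaded in one calendar day (assumption: the rate is constant).
hours_uploaded_per_day = HOURS_UPLOADED_PER_MINUTE * MINUTES_PER_DAY

print(f"{channel_days_per_minute:.1f} channel-days arrive per minute")
print(f"{hours_uploaded_per_day:,} hours arrive per day")
```

Under that assumption a single day brings in 576,000 hours of video, which is why purely manual review is described as infeasible.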


a http://www.un.org/en/universal-declaration-human-rights/


Vinton G. Cerf is vice president and Chief Internet Evangelist 
at Google. He served as ACM president from 2012-2014.

Introducing ACM Transactions
on Human-Robot Interaction 


Now accepting submissions to ACM THRI 


As of January 2018, the Journal of Human-Robot Interaction (JHRI) has become 
an ACM publication and has been rebranded as the ACM Transactions on 
Human-Robot Interaction (THRI). 


Founded in 2012, the Journal of HRI has been serving as the
premier peer-reviewed interdisciplinary journal in the field. 


Since that time, the human-robot interaction field has 
experienced substantial growth. Research findings at
the intersection of robotics, human-computer interaction,
artificial intelligence, haptics, and natural language 
processing have been responsible for important discoveries 
and breakthrough technologies across many industries. 


THRI now joins the ACM portfolio of highly respected 
journals. It will continue to be open access, fostering the 
widest possible readership of HRI research and information. 
All issues will be available in the ACM Digital Library. 


Co-Editors-in-Chief Odest Chadwicke Jenkins of the University of Michigan and 
Selma Šabanović of Indiana University plan to expand the scope of the publication, 
adding a new section on mechanical HRI to the existing sections on computational, 
social/behavioral, and design-related scholarship in HRI. 


The inaugural issue of the rebranded ACM Transactions on Human-Robot Interaction 
is planned for March 2018. 


To submit, go to https://mc.manuscriptcentral.com/thri

A Declaration of the Dependence
of Cyberspace 


JOHN PERRY BARLOW, the
famed founder of the Electronic
Frontier Foundation,
a digital-rights advocacy
group, passed away on Feb.
6, 2018. In 1996, Barlow published “A 
Declaration of the Independence of 
Cyberspace.” It offered a rebuttal to 
Internet governance by national gov- 
ernments, opening with “Govern- 
ments of the Industrial World, you 
weary giants of flesh and steel, I come 
from Cyberspace, the new home of 
Mind. On behalf of the future, I ask you 
of the past to leave us alone.” 

It is hard to believe that such a naive 
view of cyberspace was taken seriously 
just about 20 years ago, that people really 
believed simplistic statements such as 
“We believe that from ethics, enlight- 
ened self-interest, and the common- 
weal, our governance will emerge.” 
But we must remember that 20 years ago 
the Internet was indeed some kind of a 
New World, seemingly outside the 
shackling legacy of traditional gover- 
nance. That was also before the Internet 
and the World Wide Web became dom- 
inated by giant corporations, and be- 
fore Tim Berners-Lee, the recipient of 
the 2016 ACM Turing Award for invent- 
ing the World Wide Web, declared in 
2017 that “The system is failing.” 

What we have also learned in the past
20 years is that while cyberspace may be
indeed “the new home of Mind,” it is in- 
extricably connected to the physical 
world. Indeed, the economic impact of 
the Web has been and continues to be 
profound. Newspapers are struggling to 
survive because advertising income, 
which has been their main source of rev- 
enue, has migrated to the Web, with 
Google and Facebook as the main beneficiaries.
While e-commerce escalates,
traditional retail outlets are shutting
down daily, suffering from a “retail
apocalypse.” And while cyberattacks
are now a daily occurrence, there are
growing fears of a possible cyberattack
that will knock out U.S. power grids.
What happens in cyberspace does not 
stay in cyberspace! 

To my mind, however, nothing 
epitomizes the hubris of the techno- 
utopianists more than the idea of rein- 
venting money. In October 2008, the 
mysterious Satoshi Nakamoto posted a 
white paper on the new domain of bitcoin.org,
in which he asserted, “What
is needed is an electronic payment sys- 
tem based on cryptographic proof in- 
stead of trust.” Bitcoin is based on a 
P2P network where transactions are 
cryptographically verified and recorded 
in a public distributed ledger based on 
blockchain, a distributed-consensus 
protocol. We now seem to be in the 
midst of a bitcoin mania, with the value 
of bitcoins gyrating wildly, making 
double-digit moves in a single week. 
There is also significant evidence that
the bitcoin-exchange markets are being
manipulated. But bitcoin has only
been the first of dozens of cryptocurrencies,
and initial coin offerings, which
raise funds for issuing new cryptocurrencies,
are growing in popularity.
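
The phrase "cryptographically verified and recorded in a public distributed ledger" can be made concrete with a toy hash chain. The Python sketch below is an illustration only, not Bitcoin's actual protocol: it leaves out digital signatures, proof-of-work, and the peer-to-peer consensus step, and every function name in it (block_hash, make_block, build_chain, is_valid) is hypothetical. It shows just one property, that each block commits to the hash of its predecessor, so altering an old entry is detectable.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, transactions):
    """SHA-256 over a canonical JSON encoding of the block's contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(index, prev_hash, transactions):
    """Bundle the contents together with the hash that commits to them."""
    return {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }


def build_chain(batches):
    """Link one block per batch of transactions, each pointing at the last."""
    chain, prev_hash = [], GENESIS_PREV_HASH
    for index, transactions in enumerate(batches):
        block = make_block(index, prev_hash, transactions)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def is_valid(chain):
    """Recompute every hash and check every link back to the start."""
    prev_hash = GENESIS_PREV_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        recomputed = block_hash(index, prev_hash, block["transactions"])
        if block["hash"] != recomputed:
            return False
        prev_hash = block["hash"]
    return True


ledger = build_chain([["alice pays bob 5"], ["bob pays carol 2"]])
print(is_valid(ledger))   # the untouched ledger verifies
ledger[0]["transactions"][0] = "alice pays bob 500"
print(is_valid(ledger))   # the edited entry no longer matches its stored hash
```

Recomputing the hash of the edited block would not rescue the forgery in this sketch, because the next block still records the old hash as its predecessor. That is the sense in which the ledger substitutes cryptographic proof for trust; it says nothing about value or supply, which is the column's point.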

But even if bitcoin solved (quite im- 
perfectly) the verifiability and distribut- 
ed-trust issues, the idea of apolitical 
money is a dangerous fantasy. Verifiability
and trust are only two requirements
of a currency. Other requirements,
which are intimately related, are
value and supply. Central banks used to 
strive for a stable currency value. More
recently they have come to realize that a
slightly depreciating currency value 
(about 2% per year) is better for eco- 
nomic growth. To achieve this, central 
bankers use a variety of sophisticated 
monetary tools to manage the money 
supply, taking into account a large 
number of economic indicators. This is 
an enormously complicated task chal- 
lenging the best economic minds. In 
contrast, the supply of cryptocurrencies 
is a priori limited, and their gyrating value
is determined by trading decisions 
made by “investors.” So the idea that 
apolitical cryptocurrencies will replace 
political money is a delusion. 

Just like other speculative bubbles, 
the cryptocurrency bubble will also blow 
up at some point, though the timing is 
quite unpredictable. But the risk is not 
only to gullible speculators. As time 
goes on, cryptocurrencies get more en- 
meshed in our economic system and 
the risk of financial contagion grows. 
Financial contagion refers to the spread 
of market disturbances, typically on the 
downside, between different economic 
institutions and between different 
countries. The cryptocurrency bubble 
is, in my opinion, a growing systemic 
financial risk (and there are also the 
issues of susceptibility to cyberattacks 
and voracious energy appetite). 

Just as you cannot separate the mind 
and the body, you cannot separate 
cyberspace and physical space. It is 
time to accept this dependence and 
act accordingly. 

Follow me on Facebook, Google+, 
and Twitter. 


Moshe Y. Vardi (vardi@cs.rice.edu) is the Karen Ostrum 
George Distinguished Service Professor in Computational 
Engineering and Director of the Ken Kennedy Institute for 
Information Technology at Rice University, Houston, TX, USA. 
He is the former Editor-in-Chief of Communications.

Keep the ACM Code of Ethics As It Is


THE PROPOSED CHANGES to
the ACM Code of Ethics
and Professional Conduct,
as discussed by Don Gotterbarn
et al. in “ACM Code
of Ethics: A Guide for Positive Action”¹
(Digital Edition, Jan. 2018), are
generally misguided and should be 
rejected by the ACM membership. 
The changes attempt to, for example, 
create real obligations on members 
to enforce hiring quotas/priorities 
with debatable efficacy while ACM 
members are neither HR specialists 
nor psychologists; create “safe spac- 
es for all people,” a counterproduc- 
tive concept causing problems in a 
number of universities; counter ha- 
rassment while not being lawyers or 
police officers; enforce privacy while 
not being lawyers; ensure “the public 
good” while not being elected lead- 
ers; encourage acceptance of “social 
responsibilities” while not defining 
them or being elected leaders or those 
charged with implementing govern- 
ment policy; and monitor computer 
systems integrated into society for 
“fair access” while not being lawyers 
or part of the C-suite. 

ACM is a computing society, not a 
society of activists for social justice, 
community organizers, lawyers, po- 
lice officers, or MBAs. The proposed 
changes add nothing related specifi- 
cally to computing and far too much 
related to these other fields, and also 
fail to address, in any significant new 
way, probably the greatest ethical 
hole in computing today—security 
and hacking. 

If the proposed revised Code is ever 
submitted to a vote by the member- 
ship, I will be voting against it and urge 
other members to do so as well. 


Reference 
1. https://dl.acm.org/citation.cfm?id=3173016 


Alexander Simonelis, Montreal, Canada 


Authors Respond: 
ACM promotes ethical and social 
responsibility as key components of
professionalism. Computing
professionals should engage thoughtfully 
and responsibly with the systems they 
create, maintain, and use. Lawyers, 
politicians, and other members of society 
do not always fully understand the 
complexity of modern sociotechnical 
systems; computing professionals can 
help this understanding. Humans 
understand such concepts as harm, 
dignity, safety, and well-being; computing 
professionals can apply them in their 
technical decisions. We invite Simonelis to 
read the Code and accompanying 
materials in more detail, as many of his 
claims, in our opinion, misread the Code. 
We invite everyone else to read it,
too: https://ethics.acm.org/2018-code-draft-3/

Catherine Flick, Leicester, U.K.,
and Keith Miller, St. Louis, MO, USA


‘Law-Governed Interaction’ for 
Decentralized Marketplaces 

Given today’s sometimes gratuitous 
efforts toward centralized control over 
the Internet, I found it refreshing to 
read Hemang Subramanian’s article 
“Decentralized Blockchain-Based Elec- 
tronic Marketplaces” (Jan. 2018) argu- 
ing that applications like electronic 
marketplaces and social networks 
would benefit from a decentralized im- 
plementation, describing a mechanism 
based on Bitcoin’s concept of block- 
chain imposing distributed protocols, 
or what is called “smart contracts” in 
this context. 

Subramanian did not, however, 
mention the existence of a different, 
older, technique for implementing 
decentralized applications called 
“law-governed interaction,” or LGI, 
introduced in 1991 (under a different
name) by Minsky.¹ It was implemented
at Rutgers University some 10
years later and is still under development.
LGI can be used to implement
a range of decentralized applications,
including decentralized marketplaces²
and (in 2015) decentralized social
networks, the very applications that
attracted Subramanian’s interest.

It would have been instructive if
Subramanian had, say, compared and 
contrasted LGI with blockchain-based 
mechanisms for enforcing distributed 
protocols, as they are two radically dif- 
ferent mechanisms for achieving es- 
sentially the same objective. 


References 

1. Minsky, N.H. The imposition of protocols over open 
distributed systems. IEEE Transactions on Software 
Engineering 17, 2 (1991), 183-195. 

2. Serban, C., Chen, Y., Zhang, W., and Minsky, N. The 
concept of decentralized and secure electronic 
marketplace. The Journal of Electronic Commerce 
Research 8, 1-2 (June 2008), 79-101. 


Naftaly Minsky, Edison, NJ, USA 


Author Responds: 
Comparing LGI and blockchain-based smart 
contracts would be a great idea, as Minsky 
says, as they are radically different 
approaches to decentralization. However, 
from an adoption standpoint what matters 
most is mass adoption at scale. For that to 
happen, the value created by decentralization 
would have to be shared among all users in 
some tangible way. Blockchain-based 
decentralization, in addition to ensuring 
secure low-cost distributed transactions, 
could make network effects fungible through 
the issuance of cryptocoins that can be 
exchanged for fiat currency; for example, 
Steem is a popular social network that issues 
virtual currency powered by the blockchain. 
Hemang Subramanian, Miami, FL, USA 


Scant Evidence for Spirits 

Arthur Gardner’s letter to the editor 
“A Leap from Artificial to Intelli- 
gence” (Jan. 2018) on Carissa Schoe- 
nick et al.’s article “Moving Beyond 
the Turing Test with the Allen AI Sci- 
entific Challenge” (Sept. 2017) asked 
us to accept certain beliefs about arti- 
ficial intelligence. Was he writing 
that all rational beings are necessari- 
ly spiritual? “That which actually 
knows, cares, and chooses is the spir- 
it, something every human being 
has,” he said. And that all humans 
are rational? Why and how would 
someone (anyone) be convinced of 
such a hypothesis? 


Not every human, to quote Gardner,
“knows, cares, and chooses.” One
might suspect that no human infant
does, though infants may, in fact, learn and
develop these capacities over time.

What evidence is there for spirits? Would
Gardner accept an argument that 
there are no spirits? If not, would this 
not be a rejection of the scientific 
method and evidence-based reason- 
ing? Scientific hypotheses are based 
on experimental design. Valid experi- 
mental designs always allow for “falsi- 
fiability,” as argued by philosopher of 
science Karl Popper (1902-1994). 

Falsifiability (sometimes called test- 
ability) is the capacity for some propo- 
sition, statement, theory, or hypothesis 
to be proven wrong. That capacity is an 
essential component of the scientific 
method and hypothesis testing. 
Through it, we say what we know be- 
cause we test our beliefs using observa- 
tion, not faith. 

Humans are not rational by defi- 
nition. They can think and behave ra- 
tionally or not. Rational beings apply, 
explicitly or implicitly, the strategy of 
theoretical and practical rationality 
to the thoughts they accept and the 
actions they perform. A person who 
is not rational has beliefs that do 
not fully use the information he or 
she has. 

“Man is a rational animal—so at 
least I have been told. Throughout a 
long life I have looked diligently for evi- 
dence in favour of this statement, but 
so far I have not had the good fortune 
to come across it,” said British philoso- 
pher Bertrand Russell (1872-1970), 
tongue firmly planted in cheek. 

One might believe, without evi- 
dence, that “The leap from artificial to 
intelligence could indeed be infinite,” 
as Gardner claimed. However, every 
day newspapers in 22 countries are de- 
signed by my company’s AI-based soft- 
ware for classified pagination and dis- 
play ad dummying. What was once 
done by rational, thinking human de- 
signers is now done by even more ex- 
pert computer programs. And I started 
on this journey in 1973 by writing 
chess algorithms. 

To replace humans, these programs 
have no need to know what a human is 
or to care. 

Our “clever code” may just be our
DNA that through long biological evolution
has developed into what we today
call consciousness and rationality.
Perhaps these are just emergent properties
of a murmuration of neurons.
Richard J. Cichelli, Nazareth, PA, USA 


Still Looking for Direction in 
Software Development 

I have been in IT for 30 years, working
on every kind of platform and thus 
feel qualified to address several 
points about systems development 
raised by Stephen J. Andriole in his 
Viewpoint “The Death of Big Soft- 
ware” (Dec. 2017). For example, I see 
in many current “agile” cloud-based 
projects a fundamental lack of direc- 
tion. For projects that fail to perform 
as promised, the lack of a more in- 
depth requirements process can lead 
to missing critical integrations with 
other systems. I have personally seen 
at least a dozen projects spiral out of 
control and never reach a real live hu- 
man user. For example, in 2016, I 
worked with a U.S. government agency
on a very large project it had promised
to deliver by 2020 but that failed
a system test in the cloud because it 
could not meet its own integration 
and scalability goals. Even as the de- 
velopment team managed to occa- 
sionally pick off relatively minor user 
requests, it ignored the user-story re- 
quirements with deeper technical 
complexities, as in how to integrate 
with other systems. Lack of integra- 
tion led to missed deadlines for deliv- 
ering the key integrations by system 
test dates, as mandated by Article I, 
Section 2 of the United States Constitution.¹

As far as how an organization can 
get its data back if it moves from one 
cloud provider to another, the con- 
tainer “solution” might sound nice to 
users but can actually be worse than 
having table dumps from legacy sys- 
tems. The lack of documentation 
around containers, both architectur- 
ally and with respect to how contain- 
ers function within the workflow 
process and how the system will actu- 
ally process data, makes designing for 
portability exceptionally difficult or 
impossible for IT managers to main- 
tain during changes throughout the 
systems life cycle. Another challenge 
of working with containers involves
security analysts being able to per-
form proper system assessments. It 
is, in fact, some of the same micro ser- 
vices Andriole explored that can lead 
to security flaws that are then avail- 
able for exploitation by aspiring hack- 
ers with a library of scripts that can be 
run against the containers and the 
host operating system. 

Though I have great regard for 
cloud projects and the technology 
that allows faster and more-flexible 
solutions to address business needs, 
IT managers must make sure they do 
not lose the major benefits of enter- 
prise resource planning products. I 
spent the 1990s moving from piece- 
meal systems to a system where a 
business user can track raw materials 
all the way to the end product as they are
bought and sold, with just a few clicks.
I would hate to see IT managers lose 
that by going back to disparate pro- 
cesses lacking the transparent inte- 
gration I know is possible. 


Reference 

1. Library of Congress. Article 1—Legislative 
Department. U.S. Constitution; https://www.congress.gov/content/conan/pdf/GPO-CONAN-2017-9-2.pdf


Dan Lewis, Washington, D.C., USA 


Author Responds: 
The death of big software is attributable 
to failure, control, governance, cloud, and 
monolithic software, and I thank Lewis 
for addressing failure, cloud, and 
monolithic software. Cloud “containers” 
represent a first step toward hostage 
prevention. I agree that cloud security 
due diligence should always be 
aggressive. I also agree that integration is 
always important but that monolithic 
architectures do not guarantee integration 
(at the expense of flexibility) and that 
micro-services-based architectures can 
integrate and provide functional flexibility, 
with the right tools. 

Stephen J. Andriole, Villanova, PA, USA

The Costs and Pleasures of
a Computer Science Teacher 


Mark Guzdial considers the enormous opportunity costs 
of computer science teachers, while Bertrand Meyer ponders 
the pleasures of arguing with graduate students. 


Mark Guzdial
The Real Costs of a Computer Science Teacher Are
Opportunity Costs, and Those Are Enormous
http://bit.ly/2AvL2fz
December 1, 2017

Imagine that you are an undergradu- 
ate who excels at science and math- 
ematics. You could go to medical 
school and become a doctor, or you 
could become a teacher. Which 
would you choose? 

In the U.S., most students
would not see these as comparable 
choices. The average salary for a gen- 
eral practitioner doctor in 2010 was 
$161,000, and the average salary for a
teacher was $45,226. Why would you 
choose to make a third as much in sal- 
ary? Even if you care deeply about ed- 
ucation and contributing to society, 
the opportunity cost for yourself and 
your family is enormous. Meanwhile 
in Finland, the general practitioner 
makes $68,000 and the teacher makes 
$37,455. Teachers in Finland are not
paid as much as doctors (http://bit.ly/2m3ZnakK),
but Finnish teachers
make more than half of what doctors
do. In Finland, the opportunity cost of 
becoming a teacher is not as great as 
in the U.S. 
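
The comparison above can be checked directly from the four salary figures quoted in the post. The Python sketch below is only a worked calculation; the dictionary layout and the helper name teacher_share are illustrative choices, and no currency or cost-of-living adjustment is attempted.

```python
# Average salaries quoted in the post (general practitioner vs. teacher).
salaries = {
    "U.S.": {"doctor": 161_000, "teacher": 45_226},
    "Finland": {"doctor": 68_000, "teacher": 37_455},
}


def teacher_share(pay):
    """Teacher salary as a fraction of the doctor salary."""
    return pay["teacher"] / pay["doctor"]


for country, pay in salaries.items():
    gap = pay["doctor"] - pay["teacher"]  # salary given up by teaching
    print(f"{country}: teacher earns {teacher_share(pay):.0%} "
          f"of a doctor's salary, a gap of ${gap:,}")
```

On these numbers a U.S. teacher earns about 28% of a general practitioner's salary (a gap of $115,774), while a Finnish teacher earns about 55% (a gap of $30,545), which is the sense in which the opportunity cost is smaller in Finland.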

The real problem of getting 
enough computer science teachers is 
the opportunity cost. We are strug- 
gling with this cost at both the K-12 
(primary and secondary school) level 
and in higher education. 

I have been exchanging email recently
with Michael Marder of UTeach at the
University of Texas at Austin (http://bit.ly/2CKwScu).
UTeach (https://uteach.utexas.edu/)
is an innovative and successful
program that helps science,
technology, engineering, and mathe- 
matics (STEM) undergraduates be- 
come teachers. They do not get a lot of 
computer science (CS) students who 
want to become CS teachers; CS is 
among the majors that provide the 
smallest number of future teachers. A 
2011 U.K. report (http://bit.ly/2CLviXF) 
found that CS graduates are less like- 
ly to become teachers than other 
STEM graduates. 

CS majors may be just as interested 
in becoming teachers. Why don’t they? 
My guess is the perceived opportunity
cost. That may just be perception: the
average starting salary for a certified
teacher in Georgia is $38,925
(http://www.teachingdegree.org/georgia/salary/),
and the average starting salary for a
new software developer in the U.S.
(not comparing to exorbitant possible
starting salaries) is $55,000
(http://bit.ly/2CFuQJL). That’s a big difference,
but it’s not the 3x difference of
teachers vs. doctors.

We have a similar problem at the 
higher education level. The National 
Academies of Sciences, Engineering, 
and Medicine just released a report: 
Assessing and Responding to the Growth 
of Computer Science Undergraduate En- 
rollments (you can read it for free or 
buy a copy at http://bit.ly/2CWttnt), 
which describes the rapidly rising enrollments
in CS (also described in the
CRA Generation CS report, discussed
in a previous blog at http://bit.ly/2qiMahP)
and the efforts to manage
them. The problem is basically too
many students for too few teachers, 
and one reason for too few teachers is 
that computing Ph.D.’s are going into 
industry instead of academia. 

Quoting from the report: 


CS faculty hiring has become 
a significant challenge 
nationwide. The number of new 
CIS (computer and information 
science and support services)
Ph.D.’s has increased by 21%
from 2009 (1,567 Ph.D.’s) to
2015 (1,903 Ph.D.’s), while CIS
bachelor’s degree production 
has increased by 74%. During 
that time, the percentage of 
new Ph.D.’s accepting jobs in 
industry has increased 
somewhat, from 45% to 57% 
according to the Taulbee 
survey. Today, academia does 
not necessarily look attractive 
to new Ph.D.’s: the funding 
situation is tight and uncertain; 
the funding expectation of a 
department may be perceived 
as unreasonably high; the class 
sizes are large and not every 
new hire is prepared to teach 
large classes and manage TAs 
effectively; and the balance 
between building a research 
program and meeting teaching 
obligations becomes more 
challenging. For the majority of 
new CS Ph.D.’s, the research 
environment in industry is 
currently more attractive. 
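
As a rough check on the figures quoted above, the sketch below recomputes the Ph.D. growth rate and derives one extra indicator: how much faster bachelor's production grew than Ph.D. production. That last ratio is an illustration, not a number from the report, and it assumes the 74% and 21% figures cover the same 2009 to 2015 window.

```python
# Figures quoted from the National Academies report.
phds_2009 = 1_567
phds_2015 = 1_903
bachelors_growth = 0.74          # CIS bachelor's degree growth, as quoted

phd_growth = (phds_2015 - phds_2009) / phds_2009

# Illustrative indicator (an assumption, not from the report): change in
# the number of bachelor's degrees per new Ph.D. over the same period.
bachelors_per_phd_change = (1 + bachelors_growth) / (1 + phd_growth) - 1

# Shift toward industry, in percentage points (Taulbee survey figures).
industry_shift_points = round((0.57 - 0.45) * 100)

print(f"Ph.D. growth: {phd_growth:.1%}")
print(f"Bachelor's per new Ph.D.: up {bachelors_per_phd_change:.0%}")
print(f"Industry share: up {industry_shift_points} percentage points")
```

Under those assumptions, the pool of undergraduates grew about 43% faster than the pool of new Ph.D.'s available to teach them.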


The opportunity cost here influ- 
ences the individual graduate’s 
choice. The report describes new CS 
Ph.D. graduates looking at industry 
vs. academia, seeing the challenges 
of academia, and opting for indus- 
try. This has been described as the 
“eating the seed corn” problem 
(http://bit.ly/2CtEo79). (Eric Roberts 
has an origin story for the phrase at 
his website on the capacity crisis, at 
http://stanford.io/2CXoQtx.) 

That is a huge problem, but a simi- 
lar and less well-documented prob- 
lem is when existing CS faculty take 
leaves to go to industry. I do not 
know of any measures of this, but it 
certainly happens a lot—existing CS 
faculty getting scooped up into in- 
dustry. Perhaps the best-known ex- 
ample was when Uber “gutted” 
CMU’s robotics lab (see the descrip- 
tion at http://bit.ly/2qw2wVh). It hap- 
pens far more often at the individual 
level. I know several robotics, AI, machine
learning, and HCI researchers who have been
hired away on extended leaves into industry. Those are CS
faculty not on hand to help carry the 
teaching load for “Generation CS.” 

Faculty do not have to leave cam- 
pus to work with industry. Argo AI, 
for example, makes a point of fund- 
ing university-based research, of 
keeping faculty on campus teaching 
the growing load of CS majors
(http://bit.ly/2CuHA2r). Keeping the
research on-campus also helps to fund
graduate students (who may be future 
CS Ph.D.’s). There is likely an oppor- 
tunity cost for Argo AI; by bringing 
the faculty off campus to Argo full-time,
they would likely get more research
output. There is an associated
opportunity cost for the faculty; go- 
ing on leave and into industry would 
likely lead to greater pay. 

On the other hand, industry that in- 
stead hires away the existing faculty 
pays a different opportunity cost. When 
the faculty goes on leave, universities 
have fewer faculty to prepare the next 
generation of software engineers. The 
biggest cost is on the non-CS major. 
Here at Georgia Tech and elsewhere, it 
is the non-CS majors who are losing the 
most access to CS classes because of 
too few teachers. We try hard to make 
sure that the CS majors get access to 
classes, but when the classes fill, it is 
the non-CS majors who lose out. 

That is a real cost to industry. A recent
report from Burning Glass
(http://bit.ly/2EdZvL5) documents the large
number of jobs that require CS skills, 
but not a CS major. When we have too 
few CS teachers, those non-CS majors 
suffer the most. 

In the long run, which is more pro- 
ductive: Having CS faculty working full- 
time in industry today, or having a 
steady stream of well-prepared com- 
puter science graduates and non-CS 
majors with computer science skills 
for the future? 


Comments 
Great article. The first part of this 
highlights one of my major complaints 
about teacher's unions in the U.S. They 
push very hard to keep the pay for all 
teachers, regardless of subject taught, 
equal. In the case of CS, they are clearly 
hurting education by doing so. 

I also have a comment on the 
opportunity cost analysis at the 
beginning comparing doctors, teachers,
and entry-level software developers. To 
do that comparison properly, you have to 
take into account the fact that doctors 
have to stay in school a long time. 
Doctors don't start making that kind of 
money until after four years of medical 
school and ~three years of residency. 
Both teaching and software development 
can get jobs right out of undergrad. So 
you have to factor in seven years of lost 
wages for the doctor. At that point, the 
salary for the teacher has risen a little, 
while that for the software developer 
has gone up quite a bit. So while I agree 
completely with the issue of opportunity 
cost, I think that this example needs 
more details to be complete. 

—Mark Lewis 


Thanks, Mark! Great point about the 
relationship between years of school and 
salary. I agree. 

—Mark Guzdial 


Bertrand Meyer 
Small and
Big Pleasures
http://bit.ly/2Cz77eQ 
December 19, 2017 
One of the small plea- 
sures of life is to win a technical ar- 
gument with a graduate student. You 
feel good, as well you should. It is 
only human to want to be right. Be- 
sides, if you ended up being wrong 
all or most of the time, you should 
start questioning your sanity: Why 
are they the students and you the 
supervisor, rather than the other 
way around? (One of the most hypo- 
critical lies in the world is the cliché 
“I make sure to hire people who are 
smarter than I am.” Sure. So obvious- 
ly uttered—unless the person you are 
hiring is your successor—for the sole 
purpose of making you look whip- 
smart. If it were sincere, why then 
would you stay on?) 

One of the big pleasures of life is to 
lose an argument with a graduate stu- 
dent. Then you have learned something. 


Mark Guzdial is a professor in the College of Computing 
at the Georgia Institute of Technology in Atlanta, GA, USA. 
Bertrand Meyer is professor of Software Engineering at 
ETH Zurich, the Swiss Federal Institute of Technology; 
research professor at Innopolis University (Kazan, Russia), 
and chief architect of Eiffel Software (based in Goleta,
CA, USA).
In Pursuit of Virtual Life


Scientists are simulating biological organisms 
and replicating evolution in the lab. How far can 
they expand the boundaries of virtual life? 


AT FIRST GLANCE, the creature
known as Caenorhabditis elegans
(commonly referred to as C. elegans,
a type of roundworm) seems
remarkably simple; it comprises
only 959 cells and approximately
302 neurons. In contrast, the
human body contains somewhere 
around 100 trillion cells and about 100 
billion neurons in the brain. Yet decod- 
ing the genome for this worm and digi- 
tally reproducing it—something that 
could spur enormous advances in the 
understanding of life and how organ- 
isms work—is a challenge for the ages. 
“The project will take years to 
complete. It involves enormous time 
and resources,” says Stephen Larson, 
project coordinator for the Open- 
Worm Foundation. Larson, a neuro- 
scientist who is CEO of data software 
firm MetaCell, is not the only person 
focused on digitally reproducing life, 
or replicating evolution inside a com- 
puter. Researchers from a variety of 
fields are now attempting to decode 
worms, fly brains, and evolutionary 
processes in order to create virtual 
organisms and simulations of living 
creatures. It is safe to say that the field 
of executable biology—constructing 
computational models of biological 
systems—is coming to life. 


Samuel Greengard 


Simulation of electrical activity in a “virtual brain slice” formed from seven unitary digital 
reconstructions of neocortical microcircuits. 


Ultimately, this research could lead 
to a far greater understanding of how 
neurons fire and brains and entire 
organisms function. This knowledge 
would likely lead to new therapies 
and drugs for treating sickness and 
disease, but it could also produce 
new biofuels, as well as entirely new 
computing frameworks. It also raises 
questions about what constitutes life 
and whether living things can be engi- 
neered inside a computer. 


Says Larson: “Understanding how or- 
ganisms function would unlock many of 
the secrets of nature and change the way 
we view and interact with the world.” 


Beyond Biology 

The idea of developing virtual organ- 
isms and simulating physical systems 
through computing is nothing new. In 
the late 1940s, John von Neumann be- 
gan exploring the concept of creating a 
computer virus modeled after a biological
virus; he eventually developed the
first self-replicating program. In 1970,
mathematician John Horton Conway introduced
a cellular automaton system
called Conway’s Game of Life, in which a 
person configures a set of circles, then 
the computer embarks on a rudimen- 
tary evolutionary process. 
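
The "rudimentary evolutionary process" can be sketched in a few lines. The example below is a minimal illustration of the standard Game of Life rules (a dead cell with exactly three live neighbors becomes live; a live cell with two or three live neighbors survives). It represents the configured pattern as a set of grid coordinates, which is a choice made here for brevity and not something the article specifies.

```python
from collections import Counter


def step(live):
    """Advance one generation. `live` is a set of (row, col) live cells."""
    # Count live neighbors for every cell adjacent to at least one live cell.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
print(after_two == blinker)
```

Storing only the live cells keeps the grid unbounded and means each generation costs time proportional to the number of live cells.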

By the 1990s, a number of research- 
ers had begun to explore the idea of pro- 
ducing digital representations of biolog- 
ical creatures. 

For scientists, the idea of creating 
virtual life and artificial worlds inside 
a computer is rooted in practicality: it 
makes possible the study of the genet- 
ic information of an organism, or the 
creation of a virtual space to study how 
evolution and adaption take place. “Re- 
searchers can run thousands and thou- 
sands of replicates simultaneously. Ev- 
ery computer is essentially a Petri dish,” 
explains Christoph Adami, professor 
of microbiology and molecular genet- 
ics at Michigan State University. This 
approach also allows researchers to 
isolate specific components, including 
genetic coding, and “very carefully tease 
apart the different elements that go into 
the evolutionary process,” he says. 
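
The "Petri dish" idea can be illustrated with a toy model. The sketch below is not the software used by the researchers quoted here; it is a deliberately simple mutation-and-selection loop over bit strings, with every parameter (genome length, population size, mutation rate) chosen as a hypothetical value. Each seed plays the role of one isolated replicate.

```python
import random

GENOME_LENGTH = 20      # hypothetical genome size (bits)
POPULATION = 30         # hypothetical population size
GENERATIONS = 40        # hypothetical run length
MUTATION_RATE = 0.02    # hypothetical per-bit flip probability


def fitness(genome):
    # Toy fitness: the number of 1-bits in the genome.
    return sum(genome)


def mutate(genome, rng):
    # Flip each bit independently with probability MUTATION_RATE.
    return [bit ^ 1 if rng.random() < MUTATION_RATE else bit for bit in genome]


def run_replicate(seed):
    """Evolve one isolated population; return best fitness per generation."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
        for _ in range(POPULATION)
    ]
    history = []
    for _ in range(GENERATIONS):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: POPULATION // 2]
        # Keep the current best unchanged; fill the rest with mutated offspring.
        population = [parents[0]] + [
            mutate(rng.choice(parents), rng) for _ in range(POPULATION - 1)
        ]
    return history


# Five independent replicates, one per seed.
replicates = {seed: run_replicate(seed) for seed in range(5)}
for seed, history in replicates.items():
    print(seed, history[0], history[-1])
```

Because each replicate is driven only by its seed, a run can be repeated exactly, and a single parameter such as the mutation rate can be varied while everything else is held fixed.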

OpenWorm is an example of how 
this new frontier of biology and com- 
puting is unfolding. So far, “several 
hundred people” have contributed 
to the project in some way. This in- 
cludes computer scientists, math- 
ematicians, biologists, and experts in 
neuroscience. Among the core partic- 
ipants are more than a dozen academ- 
ic and research luminaries, including 
C. elegans biologists Sreekanth Cha- 
lasani at the Salk Institute, Michael 
Francis at the University of Massachu- 
setts Medical School, William Schafer 
at University of Cambridge, and An- 
drew Leifer at Princeton University. In 
addition, the organization has received 
computing input from the likes of Net- 
ta Cohen at Leeds University and Chris- 
tian Grove at the California Institute of 
Technology (Caltech).

The OpenWorm project is now near- 
ly seven years old. Larson estimates the 
project is 80% of the way toward achiev- 
ing its first goal: assembling a digital 
model of the worm that allows research- 
ers to simulate movement through sim- 
ulated viscous fluids. The team hopes to 
achieve this milestone by late this year. 
This has involved mapping cells and 
functions in the worm’s body, develop- 
ing software to run simulations, build- 
ing a digital model of C. elegans, and 
constructing an algorithm that simu- 
lates the worm’s muscle movements— 
including how electrical signals travel 
through its brain and nervous system. 
So far, researchers at Caltech have 
developed the OpenWorm Browser, 
which relies on a Web or iOS interface 
to display a three-dimensional anatom- 
ical model and actions for C. elegans. 
The browser displays different layers, 
including the skin, alimentary system, 
nervous system, reproductive system, 
and body wall muscles. In addition, a 
program called Sibernetic uses a C++ 
algorithm to model and simulate con- 
tractile matter and membranes within 
the muscle tissue of C. elegans. Another
platform, Geppetto, provides an
open source Web-based neuroscience 
simulation and visualization environ- 
ment that simulates complex biologi- 
cal systems and their surrounding en- 
vironment using multiple algorithms. 
Not surprisingly, the data process- 
ing challenges related to OpenWorm 
and developing a life-like digital mod- 
el are enormous; the overall task of 
understanding things like synthesis, 
reproduction, and digestion will likely 
take several more years. For now, re- 
searchers rely on a combination of 
classical mathematical and analytics 
tools along with machine learning to 
decode functions at the level of ion 
channels and cells. “Cells have a lot 
of extra machinery in them that is dif- 
ficult to detect, and many of these ac- 
tivities and processes are completely 
ignored by artificial neural nets,” Lar- 
son says. Simply put: mountains of 
data produce incremental gains, and 
coordinating all the research groups 
and silos is a complex endeavor. Ultimately,
“We may need to get to a new
type of computing process to under- 
stand the exotic dynamics of natural 
neural systems,” he says. 


Cracking the Code 

The OpenWorm project is one of sev- 
eral current attempts to unravel the 
mysteries of living things. 

For example, Virtual Fly Brain—a 
joint effort involving the University of 
Edinburgh, University of Cambridge, 
MRC Laboratory of Molecular Biology, 
Cambridge, and the European Bioin- 
formatics Institute—is mapping the 
physiology of the household fly. 

In 2012, Jonathan Karr at the Institute
for Genomics & Multiscale Biology
at the Mt. Sinai School of
Medicine in New York City assembled 
the first whole-cell model of Mycoplas- 
ma genitalium, a pathogenic bacterium 
that resides in humans. The model 
succeeded in predicting the viability of 
cells after genetic mutations. 

In 2016, Stanford University bio- 
engineering professor Markus W. 
Covert and a team of researchers de- 
veloped a whole-cell computational 
model; they used detailed information 
from more than 900 scientific journals 
to gain insights into previously unob- 
served cellular behaviors. 

There’s also the work of Henry 
Markram, a professor of neuroscience 
at the Ecole Polytechnique Fédérale de 
Lausanne in Switzerland, director of 
that institution’s Laboratory of Neural 
Microcircuitry, and founder and di- 
rector of the Swiss Blue Brain Project 
national brain initiative. His research 
has focused on synaptic plasticity and 
the microcircuitry of the neocortex. 
In 2005, he launched the initiative in 
order to reconstruct and simulate the 
mammalian brain, starting with the ro- 
dent neocortical column. Markram and 
fellow researchers are now attempting 
to reverse-engineer the circuitry of the 
brain—something that could radically 
redefine health and medicine. 

Make no mistake, these projects ex- 
tend far beyond a basic understanding 
of physiological mechanisms. An organ- 
ism’s behavior is affected by numerous 
factors, ranging from its environment 
to its genetics. This means that even 
when scientists decode the genome of 
a creature such as C. elegans, it remains
incredibly challenging to understand 
how cells function alone and together, 
and how they interact with the envi- 
ronment to adapt, adjust, and evolve. 
“Achieving a complete understanding 
of a worm requires incredible resourc- 
es. Understanding the mechanisms in 
more advanced lifeforms is still very far 
off into the future,” says Herbert Sauro, 
associate professor of bioengineering at 
the University of Washington. 

“It’s painstaking and arduous work 
to put all the pieces together,” says Alex- 
ander Hoffman, professor of immunol- 
ogy and microbiology at the University 
of California, Los Angeles. “It’s necessary
to pull together research from a very
large pool of existing literature, code all 
the information in a set of equations 
and parameters, and then work with 
computing software to relate model 
simulations to all the data. The problem 
for now is that there’s often not enough 
existing knowledge to deliver an accu- 
rate simulation of a phenotype—and 
so you wind up with gaps in knowledge 
that require further experimentation.” 


Mind Games 

A primary goal of these projects, and ex- 
ecutable biology in general, is to pro- 
duce reliable computer models that 
ultimately can be used to understand 
the behavior of cancer cells and ad- 
dress other debilitating or life-threat- 
ening diseases, ranging from multi- 
ple sclerosis and amyotrophic lateral 
sclerosis to heart disease and arthri- 
tis, says Sauro. “Understanding the 
internal machinery of cells and how 
they successfully orchestrate cellular 
remodeling in a way that doesn’t harm 
them could lead to faster and better 
ways to develop therapies and drugs.” 

A deeper understanding of cellular 
activity also could help researchers 
engineer and reengineer organisms 
to produce biofuels and other chemi- 
cal substances, or to produce entirely 
new categories and types of antibiot- 
ics and other medicines. 

This field of research may also have 
enormous implications for comput- 
ing, Larson says. Today, computational 
neuroscience attempts to faithfully re- 
produce the activity of neurons. Howev- 
er, neural nets do not exactly replicate 
the actions and behaviors of biologi- 
cal cells and neurons. “It’s not obvious 
what information processing neurons
are doing when you consider them as 
biological cells. There are more exotic 
dynamics taking place, but we cannot 
see them,” he says. However, “It may 
be possible to develop more advanced 
types of computing systems based on 
biology. It appears that there is more 
we can do with neural nets than we have 
been doing with deep learning.” 

Of course, the ultimate questions for 
researchers and society are where will 
all of this lead, and how exactly do we 
define life? At some point, digital code 
could replicate biological code for en- 
tire creatures, and researchers in syn- 
thetic biology might compile code to 
engineer new types of organisms—or 
autonomous devices, such as robots, 
that use biological models to think. 
This may bring the world to the highly 
discussed state of “singularity,” where 
intelligence becomes increasingly non- 
biological and humans transcend their 
biological origins. 

Concludes Sauro: “In the future, we 
could see very different definitions of 
life. Once you start creating and evolv- 
ing life-like behaviors in computer 
code or in synthetic biological systems 
and then applying them to the physical 
world, it’s possible to produce a very 
different reality.” 
UNDERSTANDING

THE POWER COMPUTING 
CAN PROVIDE 

Allison Druin is 
the inaugural 
Associate 
Provost for 
Research & 
Strategic 
Partnerships at 
the Pratt Institute in Brooklyn, 
NY. Her early education as a 
graphic designer, combined 
with her technology 
background, provides her with 
a skill set well suited for this 
new role. 

Druin earned her 
undergraduate degree in 
Graphic Design at the Rhode 
Island School of Design, her 
Master’s degree in Media 
Arts & Sciences from the 
Massachusetts Institute of 
Technology (MIT) Media 
Lab, and her Ph.D. from the 
University of New Mexico’s 
College of Education. During 
her time at MIT, developing 
new technologies for children 
became Druin’s personal 
research focus. “When you 
want to make new technology 
for children, there is no 
straight line from here to there,” 
Druin explains. 

During two decades as a 
professor at the University of 
Maryland (UM), Druin served in 
numerous roles, including lab 
director, associate dean, and 
chief futurist. She describes 
her experience as giving voice 
to users in the innovation 
process, and in developing an 
understanding of the impact of 
new innovations. 

In 2015, Druin took leave 
from UM to work as Special 
Advisor for National Digital 
Strategy for the National 
Park Service, where she led 
strategic planning to bring 
digital experiences to park 
visitors, and to also enhance 
digital preservation. 

In her new role, Druin will 
lead the initiative to expand 
research within Pratt, as well 
as with the Institute’s external 
partners. “You don’t have to 
be a computer scientist to 
understand the power that 
computing can bring to the 
world,” she says. 

— John Delaney 
The Construction Industry
in the 21st Century


Three-dimensional printing and other new technologies 
are revitalizing the business of building buildings. 


THE CONSTRUCTION OF New
York’s Empire State Building
is often seen as the figurative
and literal pinnacle
of construction efficiency,
rising 1,250 feet and 102 stories from 
the ground to its rooftop spire in just 
over 13 months’ time, at a human 
cost of just five lives. Indeed, most of 
today’s construction projects would 
be lucky to come close to that level 
of speed, regardless of the build- 
ing’s size. While the construction 
industry traditionally has been slow 
to change the way it operates, several 
new technologies are poised to usher 
in a new era of faster and more auto- 
mated construction practices. 

Three-dimensional (3D) printing is 
among the key technologies that are 
expected to change the way structures 
are built in the future, as construction 
engineers and contractors seek meth- 
ods for completing buildings more 
quickly, more efficiently, and, in many 
cases, with a greater attention paid to 
sustainability. Large printers that can 
print construction materials such as 
foam or concrete into specific shapes 
can drastically speed up the creation of 
walls, decorative or ornamental pieces, 
and even certain structural elements. 
Furthermore, in some scenarios, cus- 
tom-built or unique items can be cre- 
ated onsite or in a factory, at a much 
lower cost than by using traditional, 
one-off casting techniques. 

“If you can focus on [printing] the 
more labor-intensive components of 
the structure, then the productivity will 
increase,” says Pelin Gultekin-Bicer, 
Building Information Modeling (BIM)/ 
Virtual Design and Construction (VDC) 
project manager at VIATechnik LLC, a 
construction and engineering services 
firm based in Chicago. In addition, she 
says, “if you can integrate the 3D print- 
ers onsite, then you can improve the 
productivity, as [the construction
schedule] will be more predictable.”

A concrete villa in Binzhou, China, that was produced by a 150-meter-tall 3D printer.

The most common commercial 
use of 3D printing today is to create 
the molds that are used to cast con- 
crete panels for use in a building or 
tunnel, which is how the technology 
is being used by Australia’s Laing 
O’Rourke, a multinational construc- 
tion company. Laing O’Rourke’s 
FreeFAB technology employs
a 100x25x15-foot robotic 3D printer 
that prints molds measuring 6x4.5 
feet from a specially designed wax. 
These molds are used to cast large 
concrete panels at an offsite factory 
that are unique, in terms of size or 
shape, and likely would require a 
large traditional wood or polystyrene 
mold to be created for each panel. 
According to the company, the 
FreeFAB technology is more eco-friendly
and efficient than creating conventional
wood or polystyrene molds, because
it can print to an exact shape
with micron-level accuracy, and the 
wax molds can be melted down for re- 
use again and again. 

The completed panels are shipped 
and installed into the passenger tun- 
nels of London’s Crossrail railway 
construction project. Crossrail will 
be a high-frequency, high-capacity 
railway serving London and the 
Southeast U.K. According to James 
Gardiner, CTO and cofounder of 
FreeFAB, a spinout of Laing O’Rourke, 
utilizing 3D printing to create the 
molds, rather than the panels them- 
selves, incorporates all of the bene- 
fits (speed and quick customization) 
of 3D printing, while minimizing the 
technology’s weaknesses. 

Gardiner notes that 3D concrete 
printers require very specific tempera- 
ture and humidity levels in order for 
concrete to set properly. In addition, if 
a 3D printer pauses or stops, the
homogeneity of each layer of concrete, and
the bonds between each layer, will be 
compromised. This introduces ques- 
tions about whether the material and 
building process will be able to be cer- 
tified by local building authorities, giv- 
en that any material used in construc- 
tion must meet load-handling, 
weatherproofing, and other specific 
building code issues. 

“[Research universities] are start- 
ing to focus on the characterization of 
3D materials, trying to develop specific
materials that are particularly well-suited
for 3D printing and understanding
those already used,”
Gardiner says. “But I’d say that it’s still 
got a little way to go.” 


Other organizations, however, have 
taken to using 3D printing to construct 
entire buildings. Chinese company Win- 
Sun claims that in 2008, it printed an en- 
tire house using 3D printing technology 
in just two days. The company has since 
3D-printed larger structures, such as a 
five-story section of a city block, as well 
as one of its own manufacturing plants 
in Suzhou Industrial Park. Each of these 
projects took about a month, far less 
time than would be required with tradi- 
tional construction techniques. 

According to industry professionals, 
however, WinSun’s 3D-printed build- 
ing technology still requires that the in- 
dividual walls, floors, and roof be fas- 
tened together using traditional 
methods, including bolting walls and 
roofs together, and much of the inter- 
nal elements of a house (ductwork, 
electrical conduit, plumbing, and other 
finishes) must still be installed, rather 
than printed. 

“They’re not printing an entire six- 
story apartment building in one go,” 
explains Casey Mahon, digital practice 
manager at Carrier Johnson, an archi- 
tectural, interior design, and branding 
practice based in San Diego, CA. 
“They’re printing it in pieces. Those 
pieces are coming out to the job site, 
and they’re getting assembled. Then 
they’re getting clad with some tile or 
stone on the exterior. It’s rare that you 
ever see an interior photograph of one 
of those projects. When you do, you 
can see that, it’s clear, all the conduit 
is surface-run, all the fixtures are sur- 
face-mounted fixtures.” 

Meanwhile, researchers in France 
from the University of Nantes,
Nantes Métropole, Nantes Métropo- 
le Habitat (NMH), and Ouest Valori- 
sation, with help from teams at the 
Nantes Digital Science Laboratory 
and the Institute of Research in Civil 
and Mechanical Engineering, are 
working to create an industrial 3D 
printer that will be able to build a 
demonstration house in only a few 
days. Called BatiPrint3D, the printer 
was completed in the fall of 2017. 
The device prints the home in three 
layers, including two foam layers for 
insulation, and a third concrete layer. 
Of course, the technology is simply 
being demonstrated, and is not yet 
ready for mass commercialization. 

Similar to the WinSun house, a real 
functioning house would need to have 
its infrastructure and finishes added 
after printing, thereby putting real con- 
struction times in line with those of 
traditionally built houses. 

That is why it is most likely that 3D 
printing will be primarily used either 
offsite, to facilitate the faster creation 
of molds to create one-off castings of 
unique panels or parts, or will be
deployed onsite to print unique or
bespoke architectural or decorative
pieces of a structure, rather than en- 
tire houses, in the near future. By ad- 
dressing the labor-intensive pieces 
via onsite printing, 3D printing likely 
will improve the predictability of 
scheduling of material delivery and 
improve worker productivity, which 
together are responsible for much of 
the costs related to construction. 

“The biggest problem of the con- 
struction industry is the variance in 
schedule and cost estimation,” says 
Gultekin-Bicer, explaining that these in- 
clude material costs, shipping costs, 
and labor costs. “Cost estimations are 
derived from the schedule, so if you ex- 
tend the schedule of the building, you 
are also extending the cost,” she says, 
noting that the variability in the sched- 
ule is largely due to the human labor 
factor. “If you can integrate the 3D 
printers onsite, then we can improve 
the productivity, as [the schedule] will 
be more predictable.” 


Autonomous Vehicles 
Help Build the Future 


It is likely autonomous vehicles that can be operated within an enclosed setting, such 
as vehicles used on construction sites, will gain traction before self-driving cars do. 

San Francisco-based Built Robotics is equipping construction vehicles with 
technology that allows them to do work that is dangerous, repetitive, or both, with 
little human intervention. Late last year, the company launched an autonomous track 
loader (ATL) that uses a combination of LIDAR sensors, inertial measurement units, 
and global positioning system (GPS) technology to handle basic construction site tasks,
such as digging foundation holes. 


Caterpillar, one of the world’s largest manufacturers of heavy equipment, has 
deployed more than 100 autonomous haul trucks to mines throughout the world. 
According to the company, Caterpillar worked with Blacksburg, VA-based Torc
Robotics for a decade to develop its RemoteTask skid steer remote control system. 
Caterpillar also partnered with Torc to develop a system (due for release early 
next year) in which a Komatsu 930E haul truck can be operated by the Caterpillar
autonomous haul truck system. 


“I think the biggest challenge to autonomy is going to be on the security side,” 
says Bob Schena, chairman, CEO, and co-founder of Rajant Corp., a wireless mesh 
networking company that works with construction equipment manufacturers. “If 
governments and regulators believe that equipment could be hacked, taken over, 
and controlled by bad actors, they’re not going to allow autonomy. So, this is an 
issue that will have to be addressed, and we’re very involved in that.” 

“Right now, we are working with construction companies that are going to use 
our solutions in order to securely connect their equipment,” says Moshe Shlisel, 
CEO of Ramle, Israel-based GuardKnox. Shlisel says many construction companies 
are using relatively weak connectivity software to actively monitor and control heavy
equipment, such as cranes 200 feet tall. 


“The manufacturer of the construction equipment would like to be able to 
monitor and to predict some malfunctions [by keeping an active and constant 
wireless connection to the crane],” Shlisel says. “But you have to do that securely,
otherwise you’re having a connected motor or a connected crane that is exposed 
to every hacker that, just for fun, says to himself, ‘well, let’s see if I can control this
crane remotely’.” —K.K. 

In the future, Gardiner notes, 3D 
printing will allow for a more stream- 
lined use of materials, thanks to the 
ability to print both structural and 
decorative pieces of a building that 
place material only where needed for 
maximum strength, energy efficiency, 
or form, or to help them fit into unusu- 
al or tight site constraints. 

“At the moment, if you build a con- 
crete wall or a brick wall, you’re putting 
the same bulk material uniformly, re- 
gardless of where the stresses are on that 
particular element,” Gardiner says, observing
that 3D printing will allow architects
to design building components
that put material only where it is absolutely
needed, rather than simply adhering
to a mass-produced shape.

It is not just 3D printing that is 
poised to disrupt the construction in- 
dustry. Created by New York-based 
company Construction Robotics, a 
bricklaying robot called SAM100 can 
lay 3,000 bricks per day, effectively 
multiplying a typical human bricklay- 
er’s productivity of about 500 bricks 
per day by six. The SAM100 (whose 
name stands for “semi-automated 
mason”) uses a conveyor belt, robotic 
arm, and concrete pump to lay bricks. 
The robot’s software ensures the ro- 
bot can quickly choose between types 
of bricks, quickly lay bricks in compli- 
cated patterns, and strictly adhere to 
the building plan. However, the tech- 
nology still requires a human operator 
to smooth the concrete before placing 
additional layers of bricks. 

“Automation is popular in a con- 
trolled environment, but construction 
sites are not very controlled environ- 
ments,” says Zak Podkaminer, a market- 
ing executive with Victor, NY-based Con- 
struction Robotics. He cites weather 
(such as rain, snow, and humidity),
space constraints, and the changing layout
of the site as construction progresses
as factors that can make a construction
site less than ideal for automation.
“There’s a shortage of skilled workers 
that are going into the construction 
trades, so what the SAM does is allow 
each worker to be more productive.” 

Podkaminer notes that while the ro- 
bot is expensive, there has been a lot of 
interest from contractors and masons 
in renting the robot, which allows it to 
be used by contractors that normally 
wouldn’t have enough work to amor- 
tize its approximately $500,000 pur- 
chase price over a longer time horizon. 

Still, despite the increasing use of 
automation and 3D printing on the job 
site, the construction industry is notori- 
ously slow when it comes to adopting 
new technology. This is largely due to
the highly regulated nature of construction,
as well as the high cost of adopting
new technology; Gardiner notes that
the 3D printer used to create the massive
molds can cost $1 million.

“There are inexpensive 3D construc- 
tion [concrete] printers around,” Gar- 
diner says, “but these machines are 
generally trading off accuracy and re- 
liability to achieve their low cost. So, 
one of the things that I see is that a 
good concrete printer, a good con- 
struction 3D printer, will need to be 
highly reliable, fast, and accurate. And 
the problem with that is that a ma- 
chine that has those things is generally 
more expensive.”

Esther Shein


The State of Fakery 


How digital media could be authenticated, 
from computational, legal, and ethical points of view. 


BACK IN 1999, Hany Farid was
finishing his postdoctoral 
work at the Massachusetts 
Institute of Technology 
(MIT) and was in a library 
when he stumbled on a book called The 
Federal Rules of Evidence. The book 
caught his eye, and Farid opened to a 
random page, on which was a section 
entitled “Introducing Photos into a 
Court of Law as Evidence.” Since he 
was interested in photography, Farid 
wondered what those rules were. 

While Farid was not surprised to 
learn that a 35mm negative is consid- 
ered admissible as evidence, he was 
surprised when he read that then-new 
digital media would be treated the 
same way. “To put a 35mm file and a 
digital file on equal footing with re- 
spect to reliability of evidence seemed 
problematic to me,” says Farid, now a 
computer science professor at Dart- 
mouth College and a leading expert on 
digital forensics. “Anyone could see 
where the trends were going.” 

That led Farid on a two-decades-long
journey to consider how
digital media could be authenticated 
from the computational, legal, and eth- 
ical points of view. He and others have 
their work cut out for them; there are 
dozens of ways a digital image can be 
manipulated, from doing something 
as simple as cropping or lightening it, 
to something more nefarious. 

As fake news dominates headlines 
and the use of artificial intelligence (AI) 
to alter images, video, or photographs 
is rampant, media outlets, political 
campaigns, ecommerce sites, and even 
legal proceedings are being called into 
question for the work they generate. 
This has led to various efforts in govern- 
ment, academia, and technological 
realms to help identify such fakery. 

“More and more, we’re living in a
digital world where that underlying
digital media can be manipulated and
altered, and the ability to authenticate
is incredibly important,” says
Farid. Videos can go viral in a matter
of minutes; coupled with the pace of
technological advance and the ability
to easily deceive someone online, how
is it possible to trust what we’re seeing
coming out in the world?

[Figure: A real dog (left), and an image of a dog created by a deep convolutional generative adversarial network (GAN) algorithm.]

It is a troubling question, and no 
one, it seems, is immune from being 
targeted. Virgin Group founder Sir 
Richard Branson revealed he has 
been the target of scams using his im- 
age to impersonate him. In a blog 
post in January 2017, Branson noted 
that “the platforms where the fake 
stories are spreading need to take responsibility
... and do more to prevent
this dangerous practice.”

People need to be skeptical of what 
they see and hear, says David Schub- 
mehl, research director, Cognitive/AI 
Systems and Content Analytics at mar- 
ket research firm IDC. 

“People have to get used to the idea 
that images, voice recordings, video, 
and any other forms of media that can 
be represented as digital data can be 
manipulated and changed in one or 
more ways through the use of software,” 
Schubmehl says. 

Because machine learning is becoming
so prevalent, experts say we need
ways to improve its ability to spot deception.
For example, Schubmehl says, “Today,
researchers are experimenting with
generative adversarial networks (GANs) 
that can be used to combine two differ- 
ent types of images or video together to 
create a merged third type of video.” 

The idea behind GANs is to have two
neural networks, one that acts as a “discriminator”
and the other a “generator,”
which compete against each other
to build the best algorithm for solving a
problem. The generator network uses
feedback it receives from the discriminator
to learn to produce convincing
images that can’t be distinguished
from real images; the discriminator, in turn,
gets better at detecting something false.
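
The generator-versus-discriminator loop can be made concrete with a deliberately tiny sketch. Everything in it is an illustrative assumption rather than any system described here: the "real data" are plain numbers drawn around 4.0, the generator has a single learnable offset, the discriminator is a one-variable logistic classifier, and the gradients are written out by hand instead of using a neural network library.

```python
import math
import random

REAL_MEAN, SPREAD = 4.0, 0.5  # hypothetical "real data" distribution


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=10000, batch=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    b = 0.0          # generator: G(z) = b + SPREAD * z, with z ~ N(0, 1)
    for _ in range(steps):
        real = [rng.gauss(REAL_MEAN, SPREAD) for _ in range(batch)]
        fake = [b + SPREAD * rng.gauss(0.0, 1.0) for _ in range(batch)]

        # Discriminator step: ascend log D(real) + log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x_real, x_fake in zip(real, fake):
            d_real = sigmoid(w * x_real + c)
            d_fake = sigmoid(w * x_fake + c)
            grad_w += (1.0 - d_real) * x_real - d_fake * x_fake
            grad_c += (1.0 - d_real) - d_fake
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: ascend log D(fake), using the discriminator's
        # feedback to shift its output toward what D accepts as "real".
        grad_b = sum((1.0 - sigmoid(w * x + c)) * w for x in fake)
        b += lr * grad_b / batch
    return b, w, c
```

Calling `train_toy_gan()` returns the generator offset and the discriminator weights. In this toy setting the offset tends to drift toward the real mean as the two players trade updates; training a GAN on images is far less tame.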

Of course, image altering has its
(legal) benefits. It has allowed the 
movie industry to produce spectacu- 
lar action movies, since the vast ma- 
jority of that action is computer-gen- 
erated content, Farid points out. On 
the consumer side, image altering 
lets people create aesthetically pleas- 
ing photos, so everyone looks good in 
the same photo. 

Schubmehl agrees. “Hollywood has 
been creating fake worlds for people for 
decades and now regular people can do 
it as well,” he says. “There’s a tremen- 
dous use for tools like Photoshop to cre- 
ate exactly the right type of image that 
someone wants for whatever purposes.” 

Whether manipulated images 
should be identified as such is the sub- 
ject of much debate. While it would 
seem to be a no-brainer when it comes 
to their use in legal cases and photo- 
journalism, there are mixed opinions 
about the use of manipulated images 
in industries such as fashion, enter- 
tainment, and advertising. France re- 
cently passed a law stipulating that any 
image showing a model whose appear- 
ance has been altered must feature a 
clear and prominent disclaimer label 
to indicate this is the case, notes So- 
phie Nightingale, a postdoctoral teach- 
ing associate at Royal Holloway Univer- 
sity of London. Those who do not 
comply with the law are subject to a 
fine of 30% of the advertising cost.

“Perhaps not too surprisingly, advertisers
and publishers continue to resist
such legislation and criticize the limita- 
tions it places on free expression and 
artistic freedom,” says Nightingale, who 
completed image manipulation studies 
as part of her Ph.D. at the University of 
Warwick in Coventry, U.K. She adds that 
“many photographers believe that the 
use of image manipulation techniques 
is a positive thing that allows creative 
freedom; in fact, some suggest the abil- 
ity to manipulate images makes a pho- 
tographer more akin to a painter who 
takes something that is real and puts 
their own artistic spin on it.” 

One tricky thing is that “manipulated”
is not an easy word to define,
observes Farid, and there are a lot of
gray areas. He advocates that publish- 
ing and media outlets, as well as courts 
of law and scientific journals, should 
adhere to a policy of “you don’t have to 
tag the images, you simply have to 
show me the original.” This, he says, 
“keeps people honest.” That approach 
“allows we, the consumers, to make the 
determination, and bypasses the com- 
plexity of defining what’s appropriate 
and what isn’t,” Farid says. 

Right now, there is no way to know
for sure that a photo has been altered
when sophisticated manipulations are
used, says Nightingale. She adds,
however, that there are signs of image 
manipulation that can be identified. 

“Computer scientists working in 
digital forensics and image analysis 
have developed a suite of programs that 
detect inconsistencies in the image, 
perhaps in the lighting,” she says. Her 
work includes conducting research to 
see whether people can make use of 
these types of inconsistencies to help 
identify forgeries. 

Other more general tips Nightingale 
suggests for spotting fake photos in- 
clude using reverse image searches to 
find the image source, looking for re- 
peating patterns in the image (since 
repetition might be a sign that some- 
thing has been cloned) and checking 
the metadata, which provides details 
such as the date and time the photo 


Milestones 


BBVA Recognizes Turing Laureates 


The BBVA Foundation, 
a Spanish organization 
that promotes research, 
advanced training, and the 
transmission of knowledge 
to society, has given its 10th
Frontiers of Knowledge 
Award in the Information 
and Communication 
Technologies category to 
ACM A.M. Turing Award 
recipients Shafi Goldwasser, 
Silvio Micali, Ronald Rivest, 
and Adi Shamir for their 
“fundamental contributions 
to modern cryptology, an area 
of a tremendous impact on our 
everyday life,” in the words of 
the jury’s citation. 

“Their advanced crypto-protocols
enable the safe
and secure transmission of
electronic data, ranging from 
email to financial transactions. 
In addition, their work provides 
the underpinning for digital 
signatures, blockchains, and 
cryptocurrencies.” 

The work of Goldwasser, 
Micali, Rivest, and Shamir, 
the citation adds, “is crucial 
to the fabric of our connected 
digital society. Every time 
we log in to social media, 
purchase goods online, or 
vote or sign electronically, 
we leverage the technology 
developed by their research.” 

In 1978, Shamir and 
Rivest, together with Leonard 
Adleman, created the RSA 
algorithm, the “first of
the secure protocols that
defined the face of modern 
cryptography,” as the jury 
terms it. RSA is a “public-key” 
encryption system because 
each user has two keys: a 
public key, used to encrypt 
the message; and another 
known only to the receiver. 
The encryption process is 
based on a mathematical 
problem intractable for today’s 
computers without the aid of 
the other, private key. RSA is 
still a widely used protocol, 
particularly in combination 
with other techniques. 
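
The two-key idea can be illustrated with textbook RSA on toy numbers. This is a minimal sketch under loud assumptions: the primes 61 and 53 and the exponent 17 are chosen only for readability, the function names are hypothetical, and there is no padding, so it is not secure and is not how deployed RSA is used.

```python
def make_keypair(p, q, e=17):
    # p and q are (toy-sized) primes; e must be coprime to (p-1)*(q-1).
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def encrypt(message, public_key):
    n, e = public_key
    return pow(message, e, n)


def decrypt(ciphertext, private_key):
    n, d = private_key
    return pow(ciphertext, d, n)


public, private = make_keypair(61, 53)
ciphertext = encrypt(65, public)    # anyone holding the public key can do this
recovered = decrypt(ciphertext, private)  # only the private-key holder can
```

Recovering `d` from the public key alone would require factoring `n`, which is trivial for 3233 but believed intractable for the key sizes used in practice.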
Goldwasser and Micali 
in 1982 were working on 
a doctorate course at the 
University of California,
Berkeley, when they embarked
on a collaboration whose
first big result would lay the
theoretical foundations of
the field: the mathematical
demonstration of when
an encryption method is
genuinely unbreakable.

The BBVA jury said 
Goldwasser and Micali, 
together and separately,
“have expanded the scope
of cryptography beyond its
traditional goal of secure 
communication,” with 
developments that have helped 
build today’s flourishing digital 
society by allowing users to 
collaborate, share information, 
and shop online without 
sacrificing security.

was taken, camera settings used, and
location coordinates. 

While digital forensic techniques 
are a promising way to check the au- 
thenticity of photos, for now “using 
these techniques requires an expert 
and can be time-consuming,” Nightin- 
gale says. “What’s more, they don’t 
100% guarantee that a photo is real or 
fake. That said, digital forensic tech- 
niques and our work, which is trying to 
improve people’s ability to spot fake 
images, does at least make it more dif- 
ficult for forgers to fool people.” 

Farid has developed several tech- 
niques for determining whether an 
image has been manipulated. One 
method looks at whether a JPEG im- 
age has been compressed more than 
once. Another technique detects im- 
age cloning, which is done when try- 
ing to remove something from an im- 
age, he says. In addition, Schubmehl 
cites the development of machine 
learning algorithms by researchers at 
New York University to spot counter- 
feit items. 
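
The intuition behind clone detection (a copied region leaves identical pixel patches in two places) can be sketched in a few lines. This is a simplified illustration, not Farid's actual technique: it assumes a grayscale image stored as a list of rows, matches patches exactly, and uses a hypothetical function name. Practical detectors compare robust features so that matches survive recompression and noise.

```python
from collections import defaultdict


def find_cloned_blocks(image, block=2):
    """Group top-left coordinates of block x block patches that are identical."""
    rows, cols = len(image), len(image[0])
    seen = defaultdict(list)
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            patch = tuple(tuple(image[r + i][c:c + block]) for i in range(block))
            # Skip perfectly flat patches (sky, walls), which repeat innocently.
            if len({pixel for row in patch for pixel in row}) > 1:
                seen[patch].append((r, c))
    return [coords for coords in seen.values() if len(coords) > 1]


# Synthetic 6x6 "image" with unique pixel values, then clone a 2x2 region.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
for i in range(2):
    for j in range(2):
        image[3 + i][3 + j] = image[i][j]

suspects = find_cloned_blocks(image)
```

Here `suspects` pairs the source patch at (0, 0) with its copy at (3, 3); on a real photograph the list would need further filtering before anything could be called a forgery.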

The mission of a five-year U.S. De- 
fense Advanced Research Projects 
Agency (DARPA) program called Medi- 
For (media forensics) is to use digital 
forensics techniques to build an auto- 
mated system that can accurately an- 
alyze hundreds of thousands of im- 
ages a day, says Farid, who is 
participating in the program. “We’re 
now in the early days of figuring out 
how to scale [the system] so we can do 
things quickly and accurately to stop 
the spread of viral content that is fake
or has been manipulated,” he says.
“The stakes can be very, very high, and 
that’s something we have to worry a 
great deal about.” 

That is because a growing number 
of AI tools are increasing the ability for 
fakery to flourish, regardless of how 
they are being used. In 2016, Adobe an- 
nounced VoCo (voice conversion), es- 
sentially a “Photoshop of speech” tool 
that lets a user edit recorded speech to 
replicate and alter voices. 

Face2Face is an AI-powered tool that
can do real-time video reenactment. 
The technology lets a user “animate the 
facial expressions of the target video by 
a source actor and re-render the ma- 
nipulated output video in a photoreal- 
istic fashion,” according to its creators 
at the University of Erlangen-Nurem- 
berg, the Max Planck Institute for In- 
formatics, and Stanford University. 
When someone moves their mouth 
and makes facial expressions, those 
movements and expressions will be 
tracked and then translated onto 
someone else’s face, making it appear 
that the target person is making those 
exact movements. 

On the flip side is software that helps
users take preventative measures
against being duped. One is an AI tool
called Scarlett that was recently introduced
by adult dating site SaucyDates,
with the goal of reducing fraud
and scams in the dating industry.
Scarlett acts as a virtual assistant and,
as people are having live conversations,
it scores users; when a user’s score
reaches a threshold, the conversation is flagged and
read by a moderator. To protect the
privacy of the conversation, the moderator
can only read the suspected fraudster’s
messages, explains David
Minns, founder and CEO of software
developer DM Cubed, which developed
the SaucyDates tool. He adds that
the AI tool also warns the potential
victim of fraudulent content.
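
The score-then-flag workflow described above might look roughly like the following. This is a hedged sketch only: how Scarlett actually scores users is not described, so the keyword weights, the threshold, and the function names here are all hypothetical stand-ins for whatever model the real tool uses.

```python
# Hypothetical scoring rules; the real tool's model is not public in the text.
SUSPICIOUS_TERMS = {"wire transfer": 3, "gift card": 3, "urgent": 1}
THRESHOLD = 4


def score_message(text):
    lowered = text.lower()
    return sum(weight for term, weight in SUSPICIOUS_TERMS.items() if term in lowered)


def review_queue(messages, threshold=THRESHOLD):
    """messages is a list of (sender, text) pairs from one conversation.

    Returns only the messages of senders whose running score reached the
    threshold, so a moderator never sees the other party's side.
    """
    scores, by_sender = {}, {}
    for sender, text in messages:
        scores[sender] = scores.get(sender, 0) + score_message(text)
        by_sender.setdefault(sender, []).append(text)
    return {s: by_sender[s] for s, total in scores.items() if total >= threshold}


conversation = [
    ("alice", "Hi, how was your day?"),
    ("mallory", "This is urgent, I need a wire transfer"),
    ("alice", "That sounds odd"),
    ("mallory", "Or a gift card works"),
]
flagged = review_queue(conversation)
```

Keeping the per-sender split inside `review_queue` is one way to honor the privacy constraint: the moderator's view is built from the suspect's messages alone.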

Farid says we should absolutely be 
alarmed by the growth of software that 
enables digital media to be manipulated 
into fakes for nefarious purposes. 

“There’s no question that from the 
field of computer vision to computation- 
al photography to computer graphics to 
software that is commercially available, 
we can continue to be able to manipulate
digital content in ways that were unimaginable
a few years ago,” he says.
“And that trend is not going away.”

Privacy and Security
Making Security Sustainable


Can there be an Internet of durable goods? 


AS WE START to connect durable
goods such as cars,
medical devices, and elec- 
tricity meters to the Inter- 
net, there will be at least 
three big changes. First, security will 
be more about safety than privacy. 
Certification will no longer mean test- 
ing a car once before selling it for 10 
years; safety will mean monthly soft- 
ware updates, and security will be 
an integral part of it. Second, we will 
have to reorganize government func- 
tions such as safety regulators, stan- 
dards bodies, testing labs, and law en- 
forcement. Finally, while you might 
get security upgrades for your phone 
for two or three years, cars will need 
safety and security patches for 20 
years or more. We have no idea how to 
patch 20-year-old software; so we will 
need fresh thinking about compilers, 
verification, testing, and much else. 


Privacy, Availability, or Safety? 

The early security scares about the 
“Internet of Things” have mostly been 
about privacy. There have been re- 
ports of the CIA and GCHQ turning 
smart TVs into room bugs, while the
German government banned the Cayla
doll whose voice-recognition system
could be abused in the same way.
Yet privacy may change less than we 
think. Your car knows your location 
history, sure, but your phone knows 
that already. It also knows where you 
walk, and it is already full of adware. 

Denial of service has also been in 
the news. In October 2016, the Mirai 
botnet used 200,000 CCTV cameras 
(many of them in Brazil and Vietnam) 
to knock out Twitter in the Eastern U.S. 
for several hours. ISPs know they may 
have to deal with large floods of traffic 
from senders with whom they cannot 
negotiate, and are starting to get wor- 
ried about the cost. 

But the most important issue 
in the future is likely to be safety. 
Phones and laptops do not kill a lot 
of people, at least directly; cars and 
medical devices do. 

In 2016, Eireann Leverett, Richard 
Clayton, and I conducted a research 
project for the European Commission 
on what happens when devices that 
are subject to safety regulation start 
to contain both computers and communications.
There are surprisingly
many regulated industries; it is not
just the obvious ones like aircraft and 
railway signals, but even kids’ toys— 
they must not have lead paint, and if 
you pull a teddy bear’s arms off, they 
must not leave sharp spikes. 

So what is the strategic approach? 
We looked at three verticals—road 
vehicles, medical devices, and smart 
meters. Cars are a good example to il- 
lustrate what we learned—though the 
lessons apply elsewhere too. 


Security and Safety for Cars 

Car safety has been regulated for 
years. In the U.S., the National Highway
Traffic Safety Administration
was established in the
1960s following Ralph Nader’s campaigning;
Europe has an even more
complex regulatory ecosystem. Regulators
discovered by the 1970s that
simply doing crash tests and pub- 
lishing safety data were not enough 
to change industry behavior. They 
had to set standards for type approv- 
al, mandate recalls when needed, 
and coordinate car safety with road 
design and driver training. Insurers 
do some of the regulatory work, as
do industry bodies; but governments
provide the foundation. 

This ecosystem faces a big change 
of tempo. At present, a carmaker 
builds a few hundred prototypes, 
sends some for crash testing, has the 
software inspected, and gets certifi- 
cation under more than 100 regula- 
tions. Then it ramps up production 
and sells millions of cars. Occasion- 
ally carmakers have to change things 
quickly. When a Swedish journal- 
ist found that avoiding an elk could 
cause an A-class car model to roll 
over, Mercedes had to redesign its 
suspension and fit a stability control 
system, which delayed the product 
launch at a cost of $200 million. But
most of the safety case is a large up- 
front capital cost, while the time con- 
stant for the design, approval, and 
testing cycle is five years or so. 

In the future, a vulnerability in a 
car will not need a skillful automotive 
journalist to exploit it. Malware can do 
that. So if a car can be crashed by com- 
mands issued remotely over the Inter- 
net, it will have to be fixed. Although 
we have known for years that car software
could be hacked, the first widely
publicized demonstration that
a Jeep Cherokee could actually be run
off the road (by Charlie Miller and
Chris Valasek, in 2015) showed that
the public will not tolerate the risk.
While previous academic papers on 
car hacking had been greeted with a 
shrug, press photos of the Cherokee 
in a ditch forced Chrysler to recall 1.4 
million vehicles. 

Cars, like phones and laptops, will 
get monthly software upgrades. Tesla
has started over-the-air upgrades,
and other vendors are following suit. 
This will move safety regulation from 
pre-market testing to a safety case 
maintained in real time—a chal- 
lenge for both makers and regula- 
tors. We will need better, faster, and 
more transparent reporting of safety 
and security incidents. We will need 
responsible disclosure—so people 
who report problems are not afraid 
of lawsuits. We will need to shake 
up existing regulators, test labs, and 
standards bodies. Over two dozen 
European agencies have a role in car 
safety, and none of them have any 
cybersecurity expertise yet. We will 
need to figure out where the security 
engineers are going to sit. 

We may need to revisit the argu- 
ment between intelligence agencies 
who want “exceptional access” to systems
for surveillance, and security experts
who warn that this is hazardous.
The Director of the FBI and the U.K. 
Home Secretary both argue that they 
should be able to defeat encryption; 
they want a golden master key to your 
phone so they can listen in. But how 
many would agree to a golden master
key that would let government agents
crash their car? 

There are opportunities too. Your 
monthly upgrade to your car software 
will not just fix the latest format string 
vulnerability, but safety flaws as well. 
The move to self-driving cars will lead 
to rapid innovation with real safety 
consequences. At present, product re- 
calls cost billions, and manufacturers 
fight hard to avoid them; in the future, 
software patches will provide a much 
cheaper recall mechanism, so we can 
remove the causes of many accidents 
with software, just as we now fix dan- 
gerous road junctions physically. 

But cars will still be more difficult 
to upgrade than phones. A modern 
car has dozens of processors, in every- 
thing from engine control and navi- 
gation through the entertainment 
system to the seats, side mirrors, and 
tire-pressure sensors. The manufac- 
turer will have to coordinate and drive 
the process of updating subsystems 
and liaising with all the different sup- 
pliers. Its “lab car”—the rig that lets 
test engineers make sure everything 
works together—is already complex 
and expensive, and the process is 
about to get more complex still. 


Sustainable Safety and Security 
Perhaps the biggest challenge will be 
durability. At present most vendors 
won’t even patch a three-year-old 
phone. Yet the average age of a U.K. car
at scrappage is 14.8 years, and rising 
all the time; cars used to last 100,000 
miles in the 1980s but now keep go- 
ing for nearer 200,000. As the embed- 
ded carbon cost of a car is about equal 
to that of the fuel it will burn over its 
lifetime, a significant reduction in ve- 
hicle durability will be unacceptable 
on environmental grounds. 

As we build more complex arti- 
facts, which last longer and are more 
safety critical, the long-term main- 
tenance cost may become the limit- 
ing factor. Two things follow. First, 
software sustainability will be a big 
research challenge for computer sci- 
entists. Second, it will also be a major 
business opportunity for firms who 
can cut the cost. 

On the technical side, at present 
it is hard to patch even five-year-old 
software. The toolchain usually will 
not compile on a modern platform,
leaving options such as keeping the
original development environment 
of computers and test rigs, but not 
connecting it to the Internet. Could 
we develop on virtual platforms that 
would support multiple versions? 

That can be more difficult than it 
initially appears. Toolchain upgrades 
already break perfectly functional soft- 
ware. A bugbear of security developers 
is that new compilers may realize that 
the instructions you inserted to make 
cryptographic algorithms execute in 
constant time, or to zeroise crypto- 
graphic keys, do not affect the output. 
So they optimize them away, leaving 
your code suddenly open to side-chan- 
nel attacks. (In separate work, Laurent 
Simon, David Chisnall, and I have 
worked on compiler annotations that 
enable a security developer’s intent to 
be made explicit.) 
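
To make the constant-time concern concrete, here is a small sketch contrasting an early-exit comparison with the accumulate-and-compare idiom that security developers write by hand. It is illustrative only: the function names are hypothetical, and the Python interpreter itself makes no timing guarantees. In a compiled language, the second loop is the kind of code an optimizer may legally rewrite into the first, because both produce the same result.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    # Returns as soon as a byte differs, so run time reveals how long
    # the matching prefix is: a classic timing side channel.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    # Always touches every byte; differences are accumulated, not acted on.
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


# Both functions agree on the answer; only their timing behavior differs.
same = constant_time_equal(b"secret-tag", b"secret-tag")
different = constant_time_equal(b"secret-tag", b"secret-taG")
```

In Python, the standard library's `hmac.compare_digest` is the supported way to compare secrets; the hand-written version is shown only to expose the pattern at risk.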

Carmakers currently think their li- 
ability for upgrades ends five years af- 
ter the last car is sold. But their legal 
obligation to provide spare parts lasts 
for 10 years in Europe; and most of 
the cars in Africa arrive on the continent
secondhand, and are repaired for as 
long as possible to keep them oper- 
able. Once security patches become 
necessary for safety, who is going to 
be writing the patches for today’s cars 
in Africa in 25 years’ time? 

This brings us to the business 
side—to the question of who will 
pay for it all. Markets will provide 
part of the answer; insurance pre- 
miums are now rising because low- 
speed impacts now damage cam- 
eras, lidars, and ultrasonic sensors, 
so that a damaged side mirror can 
cost $1,000 rather than $100. The 
firms that earn money from these 
components have an incentive to 
help maintain the software that 
uses them. And part of the answer
will be legal; there have been regulations
in Europe since 2010 that force
carmakers to provide technical in- 
formation to independent garages 
and spare-parts manufacturers. It 
is tempting to hope that a free/open 
source approach might do some of 
the heavy lifting, but many critical 
components are proprietary, and 
need specialist test equipment for 
software development. We also need 
incentives for minimalism rather 
than toolchain bloat. We do not really 
know how to allocate long-term own- 
ership costs between the different 
stakeholders so as to get the socially 
optimal outcome, and we can expect 
some serious policy arguments. But 
whoever pays for it, dangerous bugs 
have to be fixed. 

Once software becomes pervasive 
in devices that surround us, that are 
online, and that can kill us, the soft- 
ware industry will have to come of 
age. As security becomes ever more 
about safety rather than just privacy, 
we will have sharper policy debates 
about surveillance, competition, 
and consumer protection. The no- 
tion that software engineers are not 
responsible for things that go wrong 
will be put to rest for good, and we will 
have to work out how to develop and 
maintain code that will go on working
dependably for decades in environments
that change and evolve.

Pamela Samuelson


Legally Speaking 
Will the Supreme Court Nix Reviews of Bad Patents?


Considering the longer-term implications 
of a soon-to-be-decided U.S. Supreme Court case. 


“BAD” PATENTS HAVE been a
plague to many in the soft- 
ware industry. Patents can 
be “bad” for numerous rea- 
sons. Although patents are 
supposed to be available only for new 
and inventive advances, sometimes it 
is difficult for examiners to locate the 
most relevant prior art. Not knowing 
about this art may cause them to ap- 
prove patents that should not have 
been issued. Sometimes claims are 
too abstract or vague to be eligible for 
patents, or are deficient in other ways. 

To address the bad patent problem, 
most developed countries have creat- 
ed administrative procedures so that 
third parties can challenge the validity 
of patents by asking a patent office tri- 
bunal to re-examine the patents. Post- 
grant review procedures are cost-ef- 
fective ways to get rid of bad patents 
without having to go through full- 
dress, multiyear, very expensive litiga- 
tion and appellate review. Patents that 
survive post-grant reviews are “stron- 
ger” for having gone through this ex- 
tra scrutiny. 

In 2011, the U.S. Congress enacted 
the America Invents Act (AIA), which, 
among other things, gave the Patent 
Trial & Appeal Board (PTAB) the pow- 
er to review issued patents and extin- 
guish claims that the board deems 
deficient for lack of novelty or inventiveness.
Those dissatisfied with
PTAB rulings can ask the Court of Appeals
for the Federal Circuit (CAFC) to
review them. 

Greene’s Energy Group was among 
the firms that asked PTAB to review a 
patent being asserted against it. Oil 
States Energy Services had sued it in 
federal court for infringement of this 
patent. PTAB agreed with Greene’s that
the Oil States patent was invalid on
lack of novelty grounds. Oil States appealed
this decision to the CAFC, in
part by attacking the constitutionality 
of the PTAB review process. 

The CAFC thought so little of Oil 
States’ appeal that it did not even issue 
an opinion to explain its denial. But Oil 
States was not ready to give up. It asked 
the U.S. Supreme Court to review the 
constitutional question. 


MARCH 2018 | VOL. 61 | NO.3 | COMMUNICATIONS OF THE ACM 27 



To everyone’s astonishment, the 
Court decided in June 2017 to hear Oil 
States’ appeal. All of a sudden, the very 
useful PTAB tool that Congress created 
to enable challenges to “bad” patents 
was at risk of being struck down on 
hoary old constitutional grounds. 


What’s at Stake? 

Oil States is obviously upset at the in- 
validation of its patent on a device and 
method to cap well-heads during hy- 
draulic fracturing (aka fracking) proce- 
dures so the well-heads can withstand 
considerable pressure caused when 
firms pump fluids into oil and gas wells 
to stimulate or increase the production. 

The Oil States lawsuit against 
Greene’s for infringing this patent was 
stayed during the PTAB review. Unless 
the Supreme Court invalidates the 
whole PTAB review procedure on con- 
stitutional grounds, Oil States will lose 
this lawsuit. 

The stakes are, of course, much larg- 
er than the loss that Oil States would 
sustain if the Supreme Court rules 
against it. In its petition asking the 
Court to hear its appeal, Oil States em- 
phasized that PTAB had overturned 
nearly 80% of the patent claims it had 
reviewed. (This affected almost 10,000
patents as of March 2016. That may
sound like a lot, but consider that the PTAB
only reviews patents when it has decided
the challenges are likely to succeed.)
The Oil States brief cited one source es- 
timating that PTAB reviews had “de- 
stroyed” $546 billion in value to the 
U.S. economy and “wiped out” $1 tril- 
lion in valuation of companies whose 
patents had been invalidated. 

If PTAB was right that these patents 
should not have issued, one might 
think it is good that companies sued 
for infringement did not have to pay 
$546 billion to license invalid patents. 
The $1 trillion in lower valuation for 
firms whose patents were struck down 
may also be well deserved. That, how- 
ever, is not directly relevant to the con- 
stitutional issue the Court will be grap- 
pling with in the Oil States case. 


Relevant Constitutional Provisions 
Explaining the legal issue before the 
Supreme Court in Oil States requires 
a brief review of key parts of the U.S. 
Constitution. Article I confers numerous
legislative powers on the U.S. Congress,
including the power to pass laws
that grant exclusive rights for limited 
times to inventors for their discoveries 
in the useful arts (that is, patents). 

Article II establishes the Executive 
Branch of the U.S. government and 
sets forth powers of the President and 
those who work within that branch of 
the government. 

Article III establishes and gives cer- 
tain powers to federal courts. Section 1 
states: “The judicial Power of the United 
States, shall be vested in one supreme 
Court, and in such inferior Courts as 
the Congress may from time to time or- 
dain and establish.” Section 2 provides 
in relevant part that “[t]he judicial Pow- 
er shall extend to all Cases, in Law and 
Equity, arising under this Constitu- 
tion, the Laws of the United States, and 
Treaties made, or which shall be made, 
under their Authority.” 

Also at issue in Oil States is the Sev- 
enth Amendment to the Constitution, 
one of the 10 amendments to the Con- 
stitution (widely known as the Bill of 
Rights), which were promulgated dur- 
ing the process of state ratification of 
the Constitution to address several civ- 
il liberties concerns raised in debates 
about the original document. 

The Seventh Amendment provides: 
“In Suits at common law, where the val- 
ue in controversy shall exceed twenty 
dollars, the right of trial by jury shall be 
preserved, and no fact tried by a jury, 
shall be otherwise re-examined in any 
Court of the United States, than accord- 
ing to the rules of the common law.” 


Oil States’ Constitutional Attack 
Oil States’ argument, boiled down to 
its essence, is this: Congress created 
the PTAB as an administrative tribunal
within the U.S. Patent and Trademark
Office (PTO). Congress gave
PTAB the power to adjudicate certain 
challenges to the validity of issued pat- 
ent claims. Under its theory, Oil States 
contends that only Article III courts 
have the constitutional authority to 
adjudicate patent validity. It further 
claims that under the Seventh Amend- 
ment, it has the right to a jury trial on 
its patent claims. 

Oil States relies on an 1898 Su- 
preme Court opinion, McCormick 
Harvesting Machine Co. v. C. Aultman 
& Co., which says, among other 
things, that a patent “is not subject to 
being revoked or cancelled by the 
President, or any other officer of the 
government” because “it has become 
the property of the patentee, and as 
such is entitled to the same legal pro- 
tection as other property.” Because 
courts had, prior to adoption of the 
Constitution, the power to review pat- 
ent validity and infringement, the 
Constitution should be understood to 
have vested Article III courts, and only 
Article III courts, with the power to 
decide whether patents are valid. 

Oil States is not the first firm to have 
raised this particular constitutional 
challenge. Three times previously the 
Supreme Court denied petitions to re- 
view the constitutionality of the PTAB 
power to extinguish patent claims. 
This explains why virtually everyone 
who pays attention to patent law was 
surprised by the Court’s decision to review
the CAFC’s decision in Oil States. 

The Oil States case has many owners 
of weak patents hoping the Court will 
find merit in Oil States’ arguments. 
Many others are worried that the
Court will get tangled up in abstract ar- 
guments about whether patents are 
“private” or “public” rights under the 
Court’s complicated jurisprudence 
about powers of Article III courts and 
powers that Congress has to create ad- 
ministrative tribunals to handle cer- 
tain kinds of claims. 


Responses to Oil States’ Claims 

Greene’s had hoped to ward off Su- 
preme Court review by relying on 
the Court’s prior lack of interest in 
hearing this kind of challenge. On 
the merits, it relied upon a 1989 Supreme
Court opinion, Granfinanciera,
S.A. v. Nordberg, which opined that
Congress has the power to decline to 
provide for jury trials to resolve cer- 
tain types of disputes between pri- 
vate parties over statutory rights as 
long as the rights at issue are integral 
to a public regulatory scheme whose 
operations Congress has assigned to 
an administrative agency. 

Greene’s also cited precedents hold- 
ing that patents are “public rights” and 
integrally related to the complex regu- 
latory scheme that Congress assigned 
to the PTO. Because PTAB’s reviews are 
part and parcel of that scheme, 
Greene’s argued that Congress had 
power to assign these reviews to the ex- 
pert agency in charge of this scheme, 
namely, the PTO. 

The PTO also filed a brief in re- 
sponse to Oil States’ petition, point- 
ing out that Congress intended for the 
AIA to “establish a more efficient and 
streamlined patent system that will 
improve patent quality and limit un- 
necessary and counterproductive liti- 
gation” in response to a “growing 
sense that questionable patents are 
too easily obtained and are too diffi- 
cult to challenge.” The PTO brief ar- 
gued that the Court should not thwart 
this laudable goal. 

It noted that Congress had given 
several agencies besides the PTO the 
power to review and correct errors in 
their past decisions. PTO examiners, 
like many other federal agents, some- 
times make mistakes. The PTAB review 
process enables correction of those 
mistakes. 

The PTO challenged Oil States’ read- 
ing of the McCormick decision. The 
holding in that case was that the patent 
office could not cancel a patent because
Congress had not given the office
power to do so. By enacting the AIA, 
however, Congress conferred such 
power on the PTO. The sentences from 
McCormick on which Oil States relied 
were only “dicta.” 


Conclusion 

Constitutional challenges to Congres- 
sional enactments are rarely success- 
ful. The Supreme Court generally gives 
a broad reading to the constitutional 
powers that Congress has to accom- 
plish its goals, including the estab- 
lishment of administrative tribunals. 
Even so, there is reason for PTAB
proponents to be worried about the Oil
States case. The Supreme Court’s juris- 
prudence on the powers of Article III 
courts versus the powers of Congress 
to establish tribunals to resolve private 
disputes is very arcane, convoluted, 
and far from consistent. 

The Court’s job is to interpret the 
Constitution. So the Justices cannot 
just decide that Congress had a good 
reason to establish PTAB and give it 
power to extinguish erroneously 
granted patents. The Justices will have 
to carve a careful path through the 
thicket of their past decisions to artic- 
ulate standards to uphold PTAB’s pat- 
ent reviews, and give guidance about 
Congress’ powers to establish other 
administrative tribunals, such as a 
small claims tribunal within the Copy- 
right Office. 

One can hope the Justices will con- 
sider the constitutional issue in Oil 
States in light of another constitution- 
al purpose, that which underlies the 
patent system: promoting the prog- 
ress of useful arts. If the only way a bad 
patent can get killed is through a $5 
million-$10 million federal court law- 
suit, lots of bad patents will go unchal- 
lenged. PTAB is an efficient mecha- 
nism to extinguish erroneously issued 
patents. It will be unfortunate indeed 
if the Court feels so caught in the web 
of its constitutional precedents that it 
cannot find a way to uphold the good 
work the PTAB has been doing. 
Computing Ethics
Ethics Omission Increases Gases Emission


A look in the rearview mirror at Volkswagen software engineering.


The Volkswagen emissions scandal came
to light in September 2015. The company
installed software into millions of
vehicles with diesel engines so that
impressive emission readings would be
recorded in laboratory conditions even
though the reality is that the diesel
engines do not comply with current
emission regulations. Volkswagen is a
global organization headquartered in
Germany; its subsidiaries adhere to
common policies and a corporate culture.
This worldwide scandal broke first in
the U.S. with ongoing investigation and
legal action there and in other countries
including Germany, Italy, and the U.K.

Combustion engines are the source 
of pollution and therefore have been 
subjected to emission control. The for- 
mation of NOx (nitrogen oxides) 
through combustion is a significant 
contributor to ground-level ozone and 
fine particle pollution that is a health 
risk. On this basis, the use of software 
to control emissions must be defined 
as safety critical for, if it fails or mal- 
functions, it can cause death or serious 
injury to people. There does not appear 
to be any acknowledgement of this 
across vehicle manufacturing. 

The statement from the U.S. Department
of Justice details the facts of the VW
emissions case. Two senior managers,
Jens Hadler and Richard Dorenkamp,
appear to be at the center of the
so-called defeat software’s ongoing
design and implementation processes.
These began in 2006, with the design of
a new diesel engine to meet stricter
U.S. emission standards to take effect
in 2007. The goal was to market new
vehicles as meeting the stricter
standards and attract U.S. buyers.
Unable to accomplish this, the engineers
working under Hadler and Dorenkamp
developed software that allowed vehicles
to distinguish test mode from drive
mode, thus satisfying the emissions test
while allowing much greater emissions
when vehicles were on the road. “Hadler
authorized Dorenkamp to proceed with the
project knowing that only the use of the
defeat device software would enable VW
diesel vehicles to pass U.S. emissions
tests.”

Drawing upon the Statement of Facts,
Leggett reported that although there had
been some concerns over the propriety of
the defeat software, all those involved
in the discussions, including engineers,
were instructed not to get caught and,
furthermore, to destroy related
documents. According to Mansouri,
Volkswagen is an autocratic company with
a reputation for avoiding dissent and
discussion. It has a compliant business
culture where employees are aware that
underperformance can result in
replacement and so management demands
must be met to ensure job security.
Three statements in particular in the
Volkswagen Group Code of Conduct:
Promotion of Interests, Secrecy, and
Responsibility for Compliance, align
with the ongoing conduct encouraged
during the emissions debacle. Trope and
Ressler explain that as an autocratic
book of rules, the group code supports
and even promotes dishonest,
dysfunctional behavior that includes the
creation of software to cheat, rather
than solve, engineering problems and to
protect that software from disclosure
as if it were a trade secret.

In January 2017, the U.S. Justice De- 
partment announced that, “Volkswa- 
gen had agreed to plead guilty to three 
criminal felony counts, and pay a $2.8 
billion criminal penalty, as a result of 
the company’s long-running scheme 
to sell approximately 590,000 diesel 
vehicles in the U.S. by using a defeat 
device to cheat on emissions tests man- 
dated by the Environmental Protection 
Agency (EPA) and the California Air Re- 
sources Board (CARB), and lying and ob- 
structing justice to further the scheme.” 


Business Analysis 

Many of the accounts about the Volk- 
swagen emissions case focus on busi- 
ness ethics with only a few touch- 
ing upon the role of the software 
engineers in this situation. These 
accounts at times are repetitive but 
intertwine to provide a rich view. The 
widespread unethical actions across 
Volkswagen can be described as a new 
type of irresponsible behavior, namely 
deceptive manipulation. The detail of
this and the associated corporate
repercussions are discussed further by
Stanwick and Stanwick.

Software engineers at Volkswagen faced
ethical and legal issues that are easy
to identify. Plant suggests that they
should have alerted external bodies
since the internal lines of reporting
were compromised. Merkel concurs, citing
the Software Engineering Code of Ethics
and Professional Practice (see
http://www.acm.org/about/se-code) by
way of justification, and adds that the
lack of whistleblowers in such a large
group is surprising. Both authors point
to the potential personal cost of
whistleblowing as the reason it did not
happen. Rhodes adds a second factor,
arguing that corporate business ethics
is very much a pro-business stance that
is implemented through corporate control
and compliance systems, and instruments
of managerial coordination. This can
enable the pursuit of business
self-interest through organized
widespread conspiracies involving lying,
cheating, fraud, and lawlessness. This
is what happened at Volkswagen. Queen
concurs, explaining that Volkswagen
intentionally deceived those to whom it
owed a duty of honesty. The pressure for
continuous growth and the perception
that failure was not an option created a
culture where corporate secrecy was
paramount, which in turn implicitly
outlawed whistleblowing.


The Role of Software Engineering 
If one has a responsibility for the plan- 
ning, design, programming, or imple- 
mentation of software then that aspect 
of one’s work falls within the scope of the 
Software Engineering Code of Ethics and 
Professional Practice regardless of one’s 
job title. In that sense software engineer- 
ing pervades this debacle and is there- 
fore worthy of further investigation. 

So what was the role of software engi- 
neers in the creation and installation of 
VW’s defeat software? This question 
can be addressed using the Software En- 
gineering Code of Ethics and Professional 
Practice. The code is long established, 
documenting the ethical and profes- 
sional obligations of software engineers 
and identifying the standards society 
expects of them. The code translates
ethical principles into practical
guidance. It encourages positive action
and resistance to acting unethically. It
has been adopted by many professional
bodies and companies worldwide and has
been translated into Arabic, Croatian,
French, German, Hebrew, Italian,
Mandarin, Japanese, and Spanish.

Software engineers and software
engineering educators have a respon- 
sibility to be cognizant of the code and 
its requirements. Public Interest, 
which is apposite for safety-critical 
software, is central to the code. Al- 
though education can influence the 
courage and capability to act in accor- 
dance with the code, that result de- 
pends on structural and psychological 
supports within the environment in 
which engineers practice. 

The actions of VW managers and 
software engineers violated the follow- 
ing principles of the code: 

Principle 1.03 “approve software 
only if they have a well-founded belief 
that it is safe, meets specifications, 
passes appropriate tests, and does not 
diminish quality of life, diminish pri- 
vacy, or harm the environment. The ul- 
timate effect of the work should be to 
the public good.” The defeat software 
is clearly unsafe given that NOx pollution
damages both health and the environ- 
ment. The public were under the mis- 
apprehension that VW cars were emit- 
ting low levels of NOx and therefore not 
a health risk. Thus, software engineers 
installed unethical software. 

Principle 1.04 “disclose to appro- 
priate persons or authorities any ac- 
tual or potential danger to the user, 
the public, or the environment, that 
they reasonably believe to be associated
with software or related documents.”
There is no evidence that any software
engineer made such a disclosure.
Commercial software is usually developed
in teams and in this case it is likely
this was a large team spanning all
aspects of software development.

Principle 1.06 “be fair and avoid de- 
ception in all statements, particularly 
public ones, concerning software or re- 
lated documents, methods and tools.” 
The emissions software was heralded 
publicly as a success when internally
there was widespread knowledge that 
this claim was fraudulent. Software en- 
gineers were likely to have been privy to 
this cover-up. 

Principle 2.07 “identify, docu- 
ment, and report significant issues of 
social concern, of which they are 
aware, in software or related documents,
to the employer or the client.”
There is some evidence that concern 
was raised about the efficacy of the 
defeat software but it seems those in 
dissent allowed themselves to be 
managed towards deception. 

Principle 3.03 “identify, define and 
address ethical, economic, cultural, le- 
gal and environmental issues related 
to work projects.” The EPA regulations 
are explicit and are legally binding. 
From the evidence accessed it is un- 
clear as to whether software engineers 
knew of the illegality of their actions. 
Nevertheless ignorance cannot and 
must not be a form of defence. 

Principle 6.06 “obey all laws govern- 
ing [the] work, unless, in exceptional 
circumstances, such compliance is in- 
consistent with the public interest.” 
This relates to the analysis under prin- 
ciple 3.03. Compliance to further the 
prosperity of Volkswagen was at the ex- 
pense of legal compliance. 

Principle 6.07 “be accurate in stat- 
ing the characteristics of software on 
which they work, avoiding not only 
false claims but also claims that might 
reasonably be supposed to be specula- 
tive, vacuous, deceptive, misleading, or 
doubtful.” Software engineers could 
argue internally that the software in- 
deed performed as it was designed to. 
However, the design was to achieve 
regulatory and public deception. 

Principle 6.13 “report significant vi- 
olations of this Code to appropriate 
authorities when it is clear that consul- 
tation with people involved in these 
significant violations is impossible, 
counterproductive or dangerous.” Giv- 
en the apparent corporate culture 
within Volkswagen there was little 
point in reporting concerns further up 
the line. In fact the corporate code 
seems at odds with the professional 
code regarding this point. Software en- 
gineers failed to report these breaches 
to appropriate authorities. 


Conclusion 

Professionals, who must have been 
party to this illegal and unethical act, 
developed and implemented this soft- 
ware. Those who undertake the plan- 
ning, development, and operation of 
software have obligations to ensure 
integrity of output and overall to
contribute to the public good. The
ethical practice of software engineers is
paramount. Practice comprises process
and product. Process concerns virtuous
conduct of software engineers,
whereas product concerns whether 
software is deemed to be ethically vi- 
able. Actions and outcomes in the 
Volkswagen case appear to have failed 
on both counts. 

These serious issues related to pro- 
fessional practice must be addressed. 
It is hoped such issues are exceptional 
but sadly it is likely they are common- 
place given the ongoing plethora of 
software disasters (see, for example, 
Catalogue of Catastrophe and Software
Fail Watch). Unethical actions
related to software engineering can be 
addressed from two sides. One side fo- 
cuses on resisting the temptation to 
perform unethical practice while the 
other focuses on reducing the oppor- 
tunity of performing unethical prac- 
tice. Society at large needs competent, 
ethical, and altruistic professionals to 
deliver societally acceptable, fit-for- 
purpose software. Both of these can be 
helped by education, but education 
will not suffice without adequate so- 
cial supports. 

In order to fulfill software engi- 
neering duties, an individual must 
fully understand the professional re- 
sponsibilities and obligations of the 
role. These are explicitly laid out in 
the Software Engineering Code of Ethics 
and Professional Practice and as such 
individuals must know and apply it to 
their everyday work. To achieve this, 
the effective education of new profes- 
sionals is essential. Teaching technol- 
ogy in isolation is unacceptable and 
dangerous. Software engineers need 
a broader education to gain the nec- 
essary skills and knowledge to act in 
a socially responsible manner not on 
the basis of instinct and anecdote but 
on rigor and justification. They must
possess practical skills to address the 
complex ethical and societal issues 
that surround evolving and emerging 
technology. Such education should 
be based on a varied diet of participa- 
tive experiential learning delivered 
by those who have a practical under- 
standing of the design, development, 
and delivery of software. Contrasting 
the Volkswagen Group Code of Con- 
duct with the Software Engineering 
Code might provide one means for 
experiential learning. Such educated 
software engineers might find ways to 
prevent the installation of unethical 
software of the future. 
The Profession of IT
The Computing Profession


Taking stock of progress toward a computing profession
since this column started in 2001.


We started this column in 2001 when ACM
was re-envisioning itself as a society
of a computing profession. ACM leaders
and many members already thought of
computing as a profession. They wanted
ACM to strengthen its support of
computing professionals and its
commitment to the practitioners of a
computing profession. How has all this
progressed in the past 17 years?

In my first column on the IT profession,
my opening question was whether a
profession is needed in the first place.
I wrote: “To most of the hundreds of
millions of computer users around the
world, the inner workings of a computer
are an utter mystery. Opening the box
holds as much attraction as lifting the
hood of a car. Users look to computing
professionals to help them with their
needs for designing, locating,
retrieving, using, configuring,
programming, maintaining and
understanding computers, networks,
applications, and digital objects.” The
need has intensified over the years
because there are now billions of users
and the technologies they rely on are
much more complex.

The ACM and IEEE Computer Society are
the two main professional societies in
computing. They are comparable in size
with approximately 100,000 members each.
ACM has traditionally emphasized the
science-math side of computing, and
IEEE-CS the engineering-design side of
computing. The two societies have
cooperated on many joint ventures
including curriculum recommendations and
accreditation. They have diverged on
certification and licensing, which
traditionally have been eschewed by ACM
leadership and embraced by IEEE-CS
leadership.

The next question was what special- 
ties have professionals organized to 
deal with specific kinds of concerns— 
for example, specialists in program- 
ming languages, operating systems, 
networks, or graphics. The ACM SIGs 
and IEEE-CS hosted organizations to 
support these groups. ACM had (and 
still has) around 40 SIGs in special- 
ized areas. The list of SIGs is a useful 
guide to the organized core specialties 
of computing. 

However, the list of ACM and IEEE-CS
specialty organizations does not come
close to covering all the organized
specialties in computing. There are two
other categories: computing-intensive
fields in science and engineering, where
computing is a tool but is not the focus
of concern, and computing-infrastructure
occupations, where specialists operate
and maintain the infrastructures on
which everyone depends. Table 1 is an
update of the original table, now
showing 52 specialties.

Table 1 reflects the interests of the
members of ACM and IEEE-CS. However,
this is not the only way to categorize
computing professionals. The U.S. Bureau
of Labor Statistics (BLS) maintains a
list of “computer and information
technology occupations” that spells out
the kinds of jobs employers recruit for.
Table 2 summarizes the BLS information
about computing.* Table 1 can
be seen as an enumeration of the spe- 
cialties covered by the groups listed in 
Table 2. No matter how you look at it, 
there is a huge market for computing 
professionals. 

Between them, the ACM, IEEE-CS, 
and a large network of education insti- 
tutions provide an extensive support 
structure for computing professionals, 
including these elements:

> Curricula that grant entry into the profession
> Standards for curricula (body of knowledge)
> Standards for professional practice
> Professional development (short courses, books)
> Accreditation guidelines and evaluation
> Certification
> Licensing
> Code of ethics
> Professional specialty groups

The professional societies do a lot 
of work to support computing profes- 
sionals! 


ACM and the Profession 

The ACM’s founders in 1947 believed 
that computers would be a permanent 
source of attention and concern, even- 
tually permeating all fields. ACM’s 
initial responses to this concern were 
a computing research journal (Jour- 
nal of the ACM), a newsletter (Com- 
munications of the ACM), and a net- 
work of local chapters. Beginning in 
the early 1960s, ACM developed and 
maintained a code of ethics. The first 
computer science departments were 
founded in 1962 and ACM issued its 
first curriculum recommendations in 
1968. Around 1985, ACM began part- 
nering with IEEE-CS on accreditation 
and upgrades to curriculum recom- 
mendations. Over the years, ACM/ 
IEEE-CS have evolved successively 
more sophisticated curriculum recom- 
mendations into a “computing body of 
knowledge.” ACM has supported pro- 
fessionals in developing competencies 
through its professional development 
center and ongoing discussions about 
good professional practice in Commu- 
nications and ACM Queue magazines. 


a https://www.bls.gov/ooh/computer-and-information-technology/home.htm
Table 1. Selected professional specialties of computing.

Computing-Core Disciplines    Computing-Intensive Fields        Computing-Infrastructure Occupations
Artificial intelligence       Aerospace engineering             Blockchain administrator
Cloud computing               Autonomous systems                Computer technician
Computer science              Bioinformatics                    Data analyst
Computer engineering          Cognitive science                 Data engineer
Computational science         Cryptography                      Database administrator
Database engineering          Computational science             Help desk technician
Computer graphics             Data science                      Identity theft recovery agent
Cyber security                Digital library science           Network technician
Human-computer interaction    E-commerce                        Professional IT trainer
Network engineering           Genetic engineering               Reputation manager
Programming languages         Information science               Security specialist
Programming methods           Information systems               System administrator
Operating systems             Public Policy and Privacy         Web identity designer
Performance engineering       Instructional design              Web programmer
Robotics                      Knowledge engineering             Web services designer
Scientific computing          Management information systems
Software architecture         Network science
Software engineering          Multimedia design
                              Telecommunications

Table 2. BLS occupations.

Category                                      Entry Degree  Median Salary in 2016
Computer and Information Research Scientists  MS            $112K
Software Developers                           BS            $102K
Computer Network Architects                   BS            $101K
Information Security Analysts                 BS            $93K
Computer Systems Analysts                     BS            $87K
Database Administrators                       BS            $85K
Computer Programmers (coders)                 BS            $80K
Network and Computer System Administrators    BS            $80K
Web Developers                                AS            $66K
Computer Support Specialists                  BS or AS      $52K

Source: bls.gov
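
Readers who want to work with the Table 2 figures directly might encode them as in the following minimal Python sketch. The rows are transcribed from Table 2, with the Web Developers figure read as $66K. The names BLS_OCCUPATIONS, salaries_by_degree, and median_salary_by_degree are hypothetical, chosen only for this illustration. The result is a median of occupation-level medians, not a true median salary per degree, so it should be read as a rough summary only.

```python
from statistics import median

# Rows transcribed from Table 2: (category, entry degree, median 2016 salary in $K).
# The Web Developers value is read as 66, an assumption about the printed figure.
BLS_OCCUPATIONS = [
    ("Computer and Information Research Scientists", "MS", 112),
    ("Software Developers", "BS", 102),
    ("Computer Network Architects", "BS", 101),
    ("Information Security Analysts", "BS", 93),
    ("Computer Systems Analysts", "BS", 87),
    ("Database Administrators", "BS", 85),
    ("Computer Programmers (coders)", "BS", 80),
    ("Network and Computer System Administrators", "BS", 80),
    ("Web Developers", "AS", 66),
    ("Computer Support Specialists", "BS or AS", 52),
]


def salaries_by_degree(rows):
    """Group the per-occupation median salaries by entry degree."""
    groups = {}
    for _category, degree, salary_k in rows:
        groups.setdefault(degree, []).append(salary_k)
    return groups


def median_salary_by_degree(rows):
    """Median of the occupation-level medians for each entry degree."""
    return {degree: median(values)
            for degree, values in salaries_by_degree(rows).items()}


if __name__ == "__main__":
    for degree, value in median_salary_by_degree(BLS_OCCUPATIONS).items():
        print(f"{degree}: ${value}K")
```

Keeping the rows as plain tuples makes it easy to update the figures when BLS publishes a newer edition of the table.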


Since the 1960s, ACM has hosted more 
than 40 special interest groups (SIGs) 
in various specialties of computing. 

In 2002, ACM established a Profes- 
sion Board, later renamed the Practi- 
tioner Board, to establish and maintain 
programs for the professional develop- 
ment of the 70% of members who are 
practicing, nonacademic profession- 
als. The Practitioner Board today of- 
fers an increasing range of programs to 
support professionals, including pro- 
fessional development, books, market- 
ing, distinguished speakers, practitio- 
ner-oriented conferences, and GPAC 
(global practitioner advisory council), 
a global network of about 100 professionals
who provide advice and ideas to
the Board. The Practitioner Board part- 
ners with relevant ACM publications, 
for example in Queue and in Ubiquity’s 
advisory panel of young professionals. 


Challenges 

In 2001, I wrote that the biggest chal- 
lenge for ACM in fostering a profession 
would be academic computer scientists 
giving up the illusion they could either 
control the profession or be seen as 
its leaders. Computing was spreading 
prolifically and professionals in other 
fields were independently organizing 
professional groups to support them. A 
conspicuous example was the computational
sciences, such as computational
physics, computational chemistry,
or computational biology, where pro- 
fessional groups were being organized 
independent of ACM or IEEE. At the 
time, computer scientists were not very 
open to reaching out to other fields and 
helping them meet their own needs in 
computing. Over the years, ACM has 
embraced this challenge and has be- 
come much better at reaching out to 
other fields. ACM has settled into a role 
as a “curator of the flame,” providing 
the definitions, bodies of knowledge, 
and standards of practice for comput- 
ing wherever it appears. 

One of the consequences of the 
spread of computing into everyone’s 
lives is that the public mood has been 
shifting to the notion that essential 
knowledge to support education and 
advancement of the profession should 
be free to the public. The ACM Digital 
Library, which has been feeling the pres- 
sure of this mood for several years, now 
supports open access to research papers 
for which the authors have paid an upfront
open access fee. The library itself is
available to more practicing profession- 
als because ACM grants access through 
licenses to organizations. These chang- 
es have reduced revenues for digital 
library and other publication subscrip- 
tions. While it does not have a final an- 
swer, ACM has been making good head- 
way toward finding revenues to support 
its knowledge base in a way that it can 
ultimately be free to the public. 

But ACM’s biggest challenge con- 
cerns the relations between two major 
sectors of its members. The “academic-research”
sector consists of members who
are on the faculty of universities and
colleges or are employed by industry
research labs. The “practitioner” sector
consists of members who are practicing
nonacademic professionals, either self-employed
or employed by a company.
ACM sometimes uses the term “indus- 
try professional” for practitioner. 

One aspect of this challenge is bridg- 
ing the gap between ACM’s treasure 
trove of research papers (in the digital 
library) and the working worlds of prac- 
titioners. Most research papers are com- 
munications among researchers that 
enable the advancement of a research 
field. Many practitioners, however, find 
these papers difficult at best and opaque 
at worst. Bridging the gap means finding
authors who understand both
worlds and can translate the key ideas 
of research into useful ideas for prac- 
tice. This is quite difficult because few 
such authors exist. A fine example is
The Morning Paper, a five-times-weekly
blog by Adrian Colyer in the U.K. (http://blog.acolyer.org);
in each issue, Colyer
translates a research paper into terms
and connections that practitioners can
use. Colyer has a relationship with ACM
through the Practitioner Board.

Another aspect of this challenge is 
in leadership: ACM has lost its ability to 
populate its leadership positions with 
a mixture of academic-research and 
practitioner professionals so that these 
two worlds will get to know each other 
and work together for a stronger pro- 
fession. All members of the ACM Coun- 
cil, and most of the SIG leadership, are 
from the academic-research sector. 
ACM has been particularly good at sup- 
porting its professional academic and 
research members with first-rate, wide- 
ly respected publications, conferences, 
and awards. ACM has been less attentive
to helping practitioners develop
their professional skills, to articulating
standards for essential professional
skills, and to developing awards and other
recognitions for industry professionals.
The ACM Nominating Committee
has its work cut out for it. 

ACM could do much more in recog- 
nizing practitioner members for their 
contributions. Most ACM awards to- 
day go to members of the academic-re- 
search sector. The Distinguished Service 
Award, initially chartered in the 1960s as 
an industry professional award co-equal 
in prestige to the Turing Award, has 
faded into semi-obscurity and is now the 
only ACM award with no purse; it could 
be rejuvenated as a major recognition 
for senior practitioner members. ACM 
could also set up new awards explicitly 
for the practitioner sector. 

Although ACM has major programs 
for practitioners—including profes- 
sional development, learning center, 
and Queue magazine—practitioner 
members frequently tell us that ACM 
does not understand them. The Prac- 
titioner Board, under the leadership 
of Terry Coatta and Stephen Ibaraki,
has begun to turn this around, with 
50 practitioner volunteers helping the 
board and another 100 providing ad- 
vice through the GPAC network. 
In my opinion, however, ACM will
not achieve its goal of supporting a 
computing profession and its practi- 
tioners without a concerted effort to 
bring practitioners into ACM leader- 
ship positions and give them more 
professional recognitions. IEEE-CS 
has been more successful with this 
than ACM. 


Self-Management 

Individual members further their pro- 
fessional development by using the 
services and support structures of ACM 
and IEEE-CS, and by improving their 
own personal practices in their pro- 
fessional relations with clients. I have 
aimed the 64 The Profession of IT col- 
umns published since 2001 to support 
the latter. These columns have exam- 
ined various aspects of the profession 
including the nature of the profession, 
education for professionals, innova- 
tion, language-action, the Internet, 
software, moods, jobs, and time management.
You can find them on my
website (see footnote b). Please contact me with your
questions or issues that I can address 
in future columns. 

Computing has come a long way in 
the 70 years since its founders’ first in- 
klings that computing would become 
a pervasive professional concern. ACM 
has developed an impressive array of 
offers in publications, a digital library, 
conferences, chapters, support for 
professional education, support for 
practitioners, and awards. Its biggest 
challenge is integrating its academic- 
research and practitioner sectors. I ex- 
pect significant progress on this chal- 
lenge in the next decade. 


b http://denninginstitute.com/pjd/PUBS/CACMcols 
Impediments with
Policy Interventions to 
Foster Cybersecurity 


A call for discussion of governmental investment 
and intervention in support of cybersecurity. 


THE LIST OF cyberattacks having
significant impacts is
long and getting longer, 
well known, and regularly 
invoked in calls for ac- 
tion. Such calls are not misplaced, 
because society is becoming more 
dependent on computing, making 
cyberattacks more capable of wide- 
spread harm. Vardi’s recent call’ “it 
is time to get government involved, 
via laws and regulations” motivates 
this Viewpoint. Indeed, we do know 
how to build more-secure systems 
than we are deploying today. And gov- 
ernments can—through regulation or 
other mechanisms—incentivize ac- 
tions that individuals and organiza- 
tions are otherwise unlikely to pursue. 
However, a considerable distance 
must be traversed from declaring that 
government interventions are needed 
to deciding particulars for those inter- 
ventions, much less intervening. To 
start, we need to agree on specific goals 
to be achieved. Such an agreement re- 
quires understanding monetary and 
other costs that we as a society are will- 
ing to incur, as well as understanding 
the level of threat to be thwarted. Only 
after such an agreement is reached, 
does it make sense for policymakers to 
contemplate implementation details. 
This Viewpoint reviews interven- 
tions often suggested for incentiviz- 
ing enhanced cybersecurity. I discuss 
the trade-offs involved in the adoption
of each. In so doing, I hope to facilitate
discussions that will lead to
agreements about goals and costs. It is 
premature to advocate for specific in- 
terventions, exactly because those dis- 
cussions have yet to take place. 


Secure Systems Are More Expensive

Assurance that a system will do what it 
should and will not do what it should 
not requires effort during development.
Somebody must pay. It could be
consumers (through higher prices), 
government (through tax credits or 
grants), or investors (if developers will 
accept reduced profits). But realize that 
the consumers, taxpayers, and inves- 
tors are just us. So before mandating 
expenditures for enhanced cybersecu- 
rity, we must decide that we are will- 
ing to pay and decide how much we are 
willing to pay. 

Other priorities will compete. Some 
will advocate using “return on investment”
(ROI) to set spending levels for
cybersecurity versus other priorities. 
But ROI is problematic as a basis for 
justifying how much to spend here. 

> There are no good ways to quantify 
how secure a system is. Measuring cy- 
bersecurity can be as difficult as estab- 
lishing assurance for a system in the 
first place, which we know to be a hard 
problem for real systems. 

> There are no good ways to quantify 
the costs of not investing in cybersecu- 
rity. To tally lost business or the work 
to recover data and systems ignores 
other, important harms from attacks. 
Disclosure of confidential informa- 
tion, for example, can destroy reputa- 
tions, constrain future actions, or un- 
dermine advantages gained through 
technological superiority. Externali- 
ties also must be incorporated into 
a cost assessment—attacks can have 
both local and remote impact, because 
the utility of an individual computer 
often depends on, or is affected by, an 
entire network. 

We should be mindful, though, that 
investments directed at other national 
priorities—defense, foreign aid, and 
social programs—are also difficult 
to evaluate in purely objective ways. 
Yet governments routinely prioritize 
across making such investments. Even 
in smaller, private-sector institutions, 
the “bottom line” is rarely all that mat- 
ters, so they too have experience in 
making investment decisions when 
ROI or other objective measures are 
not available. 

Any given intervention to encour- 
age investing in cybersecurity will allo- 
cate costs across various sectors and, 
therefore, across different sets of indi- 
viduals. A decision to invest in the first 
place might well depend on specifics 
of that allocation. We often strive to 
have those individuals who benefit the 
most be the ones who pay the most. 
But the nature of networked infra- 
structures makes it difficult to charac- 
terize who benefits from cybersecurity 
and by how much. For instance, civil 
government (and much of defense), 
private industry, and individuals all 
share the same networks and use the 
same software, so all benefit from the 
same security investments. Externali- 
ties also come into play. For example, 
should only the targeted political party 
be paying to prevent cyberattacks that,
if successful, threaten the integrity of 
an election outcome? 

Investments in cybersecurity will 
have to be recurring. Software, like a 
new bridge or building, has both an 
initial construction cost and an ongo- 
ing maintenance cost. It is true that 
software does not wear out. Neverthe- 
less, software must be maintained: 

> Today’s approaches for establish- 
ing assurance in the systems we build 
have limitations. So some vulnerabili- 
ties are likely to remain in any system 
that gets deployed. When these vulner- 
abilities are discovered, patches must 
be developed and applied to systems 
that have been installed. 

> Unanticipated uses and an environment
that evolves by accretion
mean that assumptions a system devel- 
oper will have made might not remain 
valid forever. Such assumptions con- 
stitute vulnerabilities, creating further 
opportunities for attackers. 

Ideally, systems will be structured to 
allow patching, and software produc- 
ers will engage in the continuing ef- 
fort to develop patches. Some business 
models (for example, licensing) are 
better than others (for example, sales) 
at creating the income stream needed 
to support that patch development. 


Cost Is Not the Only Disincentive 
Secure systems tend to be less con- 
venient to use, because enforcement 
mechanisms often intrude on usability. 
> One common approach for obstructing
attacks is based on monitoring.
The system authenticates each
request before it is performed and 
uses the context of past actions when 
deciding what requests are authorized 
to proceed. But user authentication requires
(tedious) user interactions with
the system; program authentication 
limits which software can be run on a 
system; and the role of context can lim- 
it a user’s flexibility in how tasks might 
be accomplished. 

> Another common approach to de- 
fense is isolation. Here, effects of ac- 
tions by users, programs, or machines 
are somehow contained. Isolation 
might be employed to keep attackers 
out or to keep attackers in. In either 
case, communications is blocked, 
which makes orchestrating coopera- 
tion difficult. We might, for example, 
facilitate secure access to a bank ac- 
count by requiring use of a Web brows- 
er that is running in a separate (real or 
virtual) computer on which there is a 
separate file system and only certain 
“safe” application programs are avail- 
able. The loss of access to other files or 
programs hinders attackers but it also 
hinders doing other tasks. 

These enforcement mechanisms 
increase the chances that malicious ac- 
tions will be prevented from executing, 
because they also block some actions 
that are not harmful. And users typi- 
cally feel inconvenienced when limita- 
tions are imposed on how tasks must 
be accomplished. So nobody will be 
surprised to learn that users regularly 
disable enforcement mechanisms— 
security is secondary to efficiently get- 
ting the job done. 


Security Can Be in Tension with Societal Values

Enhanced levels of cybersecurity can
conflict with societal values, such as 
privacy, openness, freedom of expres- 
sion, opportunity to innovate, and ac- 
cess to information. Monitoring can 
undermine privacy; authentication of 
people can destroy anonymity; authen- 
tication of programs prevents change, 
which can interfere with flexibility in 
innovation and can be abused to block 
execution of software written by com- 
petitors. Such tensions must be re- 
solved when designing interventions 
that will promote increased levels of 
cybersecurity. 

Moreover, societal values differ 
across countries. We thus should not 
expect to formulate a single uniform set 
of cybersecurity goals that will serve for 
the entire Internet. In addition, the jurisdiction
of any one government necessarily
has a limited geographic scope.
So government interventions designed 
to achieve goals in some geographic 
region (where that government has 
jurisdiction) must also accommodate 
the diversity in goals and enforcement 
mechanisms found in other regions. 


Flawed Analogies Lead to Flawed Interventions

Long before there were computers, lia- 
bility lawsuits served to incentivize the 
delivery of products and services that 
would perform as expected. Insurance 
was available to limit the insured’s 
costs of (certain) harms, where the for- 
mulation and promulgation of stan- 
dards facilitated decisions by insurers 
about eligibility for coverage. Finally, 
people and institutions were discour- 
aged from malicious acts because their 
bad behavior would likely be detected 
and punished—deterrence. 

Computers and software comprise 
a class of products and services; attackers
are people and institutions.
So it is tempting to expect that liabil- 
ity, insurance, and deterrence would 
suffice to incentivize investments to 
improve cybersecurity. 

Liability. Rulings about liability for 
an artifact or service involve compari- 
sons of observed performance with 
some understood basis for acceptable 
behaviors. That comparison is not 
possible today for software security, 
since software rarely comes with full 
specifications of what it should and 
should not do. Software developers 
and service providers shun provid- 
ing detailed system specifications be- 
cause specifications are expensive to 
create and could become an impedi- 
ment to making changes to support 
deployment in new settings and to 
support new functionality. Having a 
single list that characterizes accept- 
able behavior for broad classes of 
systems (for example, operating sys- 
tems or mail clients) also turns out to 
be problematic. First, by its nature, 
such a list could not rule out attacks 
to compromise a property that is spe- 
cific only to some element in the class. 
Second, to the extent that such a list 
rules out repurposing functionality 
(and thereby blocks certain attacks), 
the list would limit opportunities for 
innovations (which often are implemented
by repurposing functionality).

Insurance. Insurance depends for 
pricing on the use of data about past 
incidents and payouts to predict fu- 
ture payouts. But there is no reason 
to believe that past attacks and com- 
promises to computing systems are 
a good predictor of future attacks or 
compromises. I would hope succes- 
sive versions of a given software com- 
ponent will be more robust, but that 
is not guaranteed. For example, new 
system versions often are developed 
to add features, and a version that 
adds features might well have more 
vulnerabilities than its predecessor. 
Moreover, software deployed in a large 
network is running in an environment 
that is likely to be changing. These 
changes—which might not be under 
the control of the developer, the user, 
the agent issuing insurance, or even 
any given national government— 
might facilitate attacks, and that fur- 
ther complicates the use of historical 
data for predicting future payouts. 

Companies that offer insurance can 
benefit from requiring compliance 
with industrywide standards since the 
domain of eligible artifacts is now nar- 
rowed, which simplifies predictions 
about possible adverse incidents and 
payouts. Good security standards also 
will reduce the likelihood of adverse 
incidents. However, any security stan- 
dard would be equivalent to a list of ap- 
proved components or allowed classes 
of behavior. Such a list only can rule 
out certain attacks and it can limit op- 
portunities for innovation, so security 
standards are unlikely to be popular 
with software producers. 

Deterrence. Finally, deterrence is 
considerably less effective in cyberspace
than in the physical world. Deterrence
depends on being able to attribute
acts to individuals or institutions
and then punish the offenders. 

> Attribution of attacks delivered 
over a network is difficult, because 
packets are relayed through multiple 
intermediaries and, therefore, pur- 
ported sources can be spoofed or re- 
written along the way. Attribution thus 
requires time-consuming analysis of 
information beyond what might be 
available from network traffic. 

> Punishment can be problematic
because attackers can work outside 
the jurisdiction of the government 
where their target is located. To limit 
or monitor all traffic that is destined 
to the hosts within some govern- 
ment’s jurisdiction can interfere with 
societal values such as openness and 
access to information. Such monitor- 
ing also is infeasible, given today’s net- 
work architecture. 


Making Progress 

The time is ripe to be having discus- 
sions about investment and govern- 
ment interventions in support of cyber- 
security. How much should we invest? 
And how should we resolve trade-offs 
that arise between security and (other) 
societal values? It will have to be a national
dialogue. Whether or not computer
scientists lead, they need to be
involved. And just as there is unlikely 
to be a single magic-bullet technology 
for making systems secure, there is un- 
likely to be a magic-bullet intervention 
to foster the needed investments. 
Responsible Research
with Crowds: Pay Crowdworkers 
at Least Minimum Wage 


High-level guidelines for the treatment of crowdworkers. 


CROWDSOURCING IS INCREASINGLY important in scientific
research. According to Google Scholar, the number
of papers including the term “crowdsourcing” has grown
from fewer than 1,000 papers per year
pre-2008 to over 20,000 papers in 2016
(see the accompanying figure).

Crowdsourcing, including crowd- 
sourced research, is not always conduct- 
ed responsibly. Typically this results not 
from malice but from misunderstanding
or desire to use funding efficiently.
Crowdsourcing platforms are complex;
clients may not fully understand how
they work. Workers’ relationships to
crowdwork are diverse—as are their ex- 
pectations about appropriate client be- 
havior. Clients may be unaware of 
these expectations. Some platforms 
prime clients to expect cheap, “fric- 
tionless” completion of work without 
oversight, as if the platform were not an 
interface to human workers but a vast 
computer without living expenses. But 
researchers have learned that workers 
are happier and produce better work 
when clients pay well, respond to work- 
er inquiries, and communicate with 
workers to improve task designs and 
quality control processes. Workers 
have varied but undervalued or unrec- 
ognized expertise and skills. Workers 
on Amazon’s Mechanical Turk platform
(“MTurk”), for example, are more educated
than the average U.S. worker.’


Papers in Google Scholar that use the term “crowdsourcing.” 
Many advise clients on task design
through worker forums. Workers’ skills 
offer researchers an opportunity to shift 
perspective, treating workers not as in- 
terchangeable subjects but as sources 
of insight that can lead to better re- 
search. When clients do not understand 
that crowdsourcing work, including re- 
search, involves interacting through a 
complex, error-prone system with hu- 
man workers with diverse needs, expec- 
tations, and skills, they may uninten- 
tionally underpay or mistreat workers. 
On MTurk, for example, clients may 
refuse to pay for (“reject”) completed 
work for any reason. Rejection exists to 
prevent workers from cheating—for ex- 
ample, completing a survey with ran- 
dom answers. But rejection also has a 
secondary usage: the percentage of
tasks a worker has had “approved”— 
that is, the percentage of tasks their cli- 
ents chose to pay for—is interpreted as 
a proxy for worker quality, and used to 
automatically screen workers for tasks. 
A worker’s “approval rate,” however, 
can be negatively affected by client er- 
rors in quality control, compromising 
workers’ eligibility for other tasks. 
MTurk offers workers no way to contest 
rejections and no information about a 
client’s rejection history. Clients can 
screen workers based on a form of 
“reputation,” but not the reverse. 
These dynamics seem especially 
relevant for workers who rely on 
crowdwork as a primary or significant 
secondary source of income. While 
some readers may be surprised to 
hear that people earn a living through 
crowdwork, research shows this is
increasingly common, even in rich 
countries. In a 2015 International La- 
bour Organization survey of MTurk 
workers (573 U.S. respondents), 38% 
of U.S. respondents said crowdwork 
was their primary source of income, 
with 40% of these (15% of U.S. respon- 
dents) reporting crowdwork as their 
only source of income.’ In a 2016 Pew 
survey of 3,370 MTurk workers, 25% of 
U.S. respondents said that MTurk spe- 
cifically was the source of “all or most” 
of their income." 

While it is to our knowledge gener- 
ally not possible to be certain how rep- 
resentative any survey of crowdworkers 
is, these findings are consistent with 
both other MTurk-specific research 
and recent national surveys of online 
labor platform activity broadly—which 
includes “microtasking” platforms 
(such as MTurk), platforms for in-per- 
son work (such as Uber), and platforms 
for remote work (such as Upwork). For 
example, Farrell and Greig’ found that 
overall the “platform economy was a 
secondary source of income,” but that 
“as of September 2015, labor platform 
income represented more than 75% of 
total income for 25% of active [labor 
platform] participants,” or approxi- 
mately 250,000 workers.* 
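
As a rough check, the estimate of approximately 250,000 workers can be reproduced from the inputs given in the accompanying footnote. The following is a minimal sketch in Python; the variable and function names are illustrative, and the step of rounding down to a multiple of 50,000 is an assumption about how the conservative figure was reached, not something the footnote specifies.

```python
# Inputs as stated in the footnote (Farrell and Greig; CIA World Factbook).
US_POPULATION = 321_369_000   # total U.S. population
ADULT_SHARE = 0.801           # share of population counted as "adult"
ACTIVE_SHARE = 0.004          # adults receiving labor platform income each month
HIGH_RELIANCE_SHARE = 0.25    # active participants earning >75% of income there


def estimate_high_reliance_workers(population, adult_share,
                                   active_share, high_reliance_share):
    """Multiply the population through each successive share."""
    return population * adult_share * active_share * high_reliance_share


estimate = estimate_high_reliance_workers(
    US_POPULATION, ADULT_SHARE, ACTIVE_SHARE, HIGH_RELIANCE_SHARE
)

# Assumption: round down to a multiple of 50,000 to stay conservative,
# since the adult share used here (15 or older) overstates the 18+ group.
ROUNDING_STEP = 50_000
rounded = int(estimate // ROUNDING_STEP) * ROUNDING_STEP

print(f"raw estimate: {estimate:,.0f}")
print(f"rounded down: {rounded:,}")
```

The raw product lands slightly above 257,000, which is why the rounded figure of 250,000 is a cautious lower estimate rather than an exact count.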

With crowdwork playing an eco- 
nomically important role in the lives of 
hundreds of thousands—or millions— 
of people worldwide, we ask: What are 
the responsibilities of clients and plat- 
form operators? 

Crowdsourcing is currently largely 
“outside the purview of labor laws”*— 
but only because most platforms classify 
workers as “independent contractors,” 
not employees. “Employees” in the U.S. 
are entitled to the protections of the Fair 
Labor Standards Act—minimum wage 


a Farrell and Greig’ report that 0.4% of adults 
“actively participate in” (receive income from) 
labor platforms each month. (“Labor plat- 
forms” here include both platforms for in-per- 
son work such as Uber as well as platforms for 
remote work such as MTurk and Upwork.) Per 
the CIA World Factbook, the U.S. total popula- 
tion is 321,369,000, with approximately 80.1% 
“adult” (“15 years or older”). Therefore the 
number of U.S. adults earning more than 75% 
of their income from labor platforms is ap- 
proximately 0.25 * 0.004 * 0.801 * 321369000, 
or 257,415. “Adults” is interpreted by Farrell 
and Greig as “18 years or older,” not “15 years 
or older,” so we round down to 250,000. 
and overtime pay—but contractors are
not. (Many countries have similar dis- 
tinctions.) While this legal classification 
is unclear and contested, and there is 
growing recognition that at least some 
crowdworkers should receive many or 
all protections afforded employees (in- 
cluding Salehi et al.,? Michelucci and 
Dickinson,’ and Berg’), these intentions 
have not yet been realized. 

Our own research, which we have 
asked researchers to stop citing’? and 
will therefore not cite here, has been 
used to justify underpayment of 
workers. Reporting on MTurk demo- 
graphics in 2008-2009, we reported 
that workers responding to our survey 
earned on average less than $2/hour. 
This figure has been cited by research- 
ers to justify payment of similar wages. 

Our (now outdated) descriptive research,
which reported averages from a
sizable but not necessarily representa- 
tive sample of MTurk workers, was not 
an endorsement of that wage. Addi- 
tionally, eight years have passed since 
that study—it should not be used to 
orient current practice. 

Therefore, we build on a long-run- 
ning conversation in computing re- 
search on ethical treatment of crowd- 
workers (for example, Bederson and 
Quinn’) by offering the following high- 
level guidelines for the treatment of 
paid crowdworkers in research. 

Pay workers at least minimum wage 
at your location. Money is the primary 
motivation for most crowdworkers 
(see, for example, Litman et al.° for 
MTurk). Most crowdworkers thus re- 
late to paid crowdwork primarily as 
work, rather than as entertainment or a
hobby; indeed, as noted previously, a 
significant minority rely on crowdwork 
as a primary income source. Most de- 
veloped economies have set minimum 
wages for paid work; however, the com- 
mon requirement (noted earlier) that 
workers agree to be classified as inde- 
pendent contractors allows workers to 
be denied the protections afforded em- 
ployees, including minimum wage. 

Ethical conduct with respect to re- 
search subjects often requires re- 
searchers to protect subjects beyond 
the bare minimum required by law; giv- 
en the importance of money as a motiva- 
tion for most crowdworkers, it is ethi- 
cally appropriate to pay crowdworkers 
minimum wage. Further, workers have
requested this (Salehi et al.°). Ethics de- 
mands we take worker requests seriously. 

While crowdworkers are often locat- 
ed around the world, minimum wage 
at the client’s location is a defensible 
lower limit on payment. If workers are 
underpaid, for example, due to under- 
estimation of how long a task might 
take, correct the problem (for instance, 
on MTurk, with bonuses). On MTurk, if 
workers are refused payment mistak- 
enly, reverse the rejections to prevent 
damage to the workers’ approval rat- 
ing. Note that fair wages lead to higher 
quality crowdsourced research.® 
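
The correction for an underestimated task duration comes down to simple arithmetic. The following is a minimal sketch in Python, assuming a hypothetical hourly minimum of $7.50 and illustrative task prices; the function name `bonus_cents` is our own and is not part of any MTurk tool. Amounts are kept in whole cents, and the target is rounded up so the worker is never left short.

```python
import math


def bonus_cents(paid_cents: int, actual_minutes: float,
                hourly_minimum_cents: int) -> int:
    """Return the per-task bonus (in cents) needed to reach the hourly minimum.

    paid_cents           -- what the worker was already paid for the task
    actual_minutes       -- how long the task really took
    hourly_minimum_cents -- minimum wage at the client's location, in cents
    """
    # Pay owed for the time actually worked, rounded up to a whole cent.
    target_cents = math.ceil(hourly_minimum_cents * actual_minutes / 60)
    # Never return a negative bonus if the task was already paid well enough.
    return max(0, target_cents - paid_cents)


# Hypothetical example: a task priced at $0.50 on the assumption that it
# takes 4 minutes, which in practice took workers 8 minutes.
HOURLY_MINIMUM_CENTS = 750  # assumed value for illustration only
owed = bonus_cents(paid_cents=50, actual_minutes=8,
                   hourly_minimum_cents=HOURLY_MINIMUM_CENTS)
print(f"bonus owed per task: ${owed / 100:.2f}")
```

In this example the task should have paid $1.00, so each worker who completed it is owed a further $0.50.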

Remember you are interacting with 
human beings, some of whom com- 
plete these tasks for a living. Treat them 
at least as well as you would treat an in- 
person co-worker. As workers them- 
selves have gone to great lengths to ex- 
press to the public,* crowdworkers are 
not interchangeable parts of a vast 
computing system, but rather human 
beings who must pay rent, buy food, 
and put children through school—and 
who have, just like clients, career and 
life goals and the desire to be acknowl- 
edged, valued, and treated with respect. 

Respond quickly, clearly, concisely, 
and respectfully to worker questions 
and feedback via both email and worker
forums (for example, turkernation.com,
mturkcrowd.com). In addition to
being a reasonable way to engage with 
human workers, this engagement may 
also improve the quality of the work 
you receive, since you may be informed 
of task design problems before a great 
deal of work has been done—and be- 
fore you have incurred a responsibility 
to pay for that work, which was done in 
good faith. 


Learn from workers. If workers tell 
you about technical problems or un- 
clear instructions, address them 
promptly, developing workarounds as 
needed for workers who have complet- 
ed the problematic task. Especially if 
you are new to crowdsourcing, you 
may unknowingly be committing er- 
rors or behaving inappropriately due 
to your study design or mode of en- 
gagement. Many workers have been 
active for years, and provide excellent 
advice. Workers communicate with 
one another and with clients in forums 
(as described earlier); MTurk workers 
in particular have articulated best 
practices for ethical research in the Dynamo
Guidelines for Academic Requesters
(guidelines.wearedynamo.org; Salehi et al.°).

Currently, the design of major
crowdsourcing platforms makes it dif- 
ficult to follow these guidelines. Con- 
sider a researcher who posts a task to 
MTurk, and after the task is posted, 
discovers that even expert workers 
take twice as long as expected. This is 
unsurprising; recent research shows 
that task instructions are often un- 
clear to workers. If this researcher 
wishes to pay workers “after-the-fact” 
bonuses to ensure they are paid the in- 
tended wage, this can only be done 
one-by-one or with command-line 
tools. The former is time-consuming 
and tedious; the latter is only usable 
for a relative minority of clients. The 
platform’s affordances (or non-affor- 
dances) are powerful determiners of 
how clients (are able to) treat workers. 
We suggest platform operators would 
do workers, clients, and themselves a 
service by making it easier for clients 
to treat workers well in these cases. 

Finally, we call on university Institu- 
tional Review Boards to turn their atten- 
tion to the question of responsible 
crowdsourced research. Crowdworkers 
relate to their participation in crowd- 
sourced research primarily as workers. 
Thus the relation between researchers 
and crowdworkers is markedly different 
than researchers’ relation to study par- 
ticipants from other “pools.” While 
there may be some exceptions, we thus 
believe researchers should generally pay 
crowdworkers at least minimum wage. 
We urge IRBs to consider this position. 

These suggestions are a start, not a 
comprehensive checklist. To make
crowdsourced research responsible,
researchers and IRBs must develop
ongoing, respectful dialogue
with crowdworkers.


Further Reading 

For detailed treatment of ethical issues 
in crowdwork, see Martin et al.’ For al- 
ternatives to MTurk, see Vakharia and 
Lease" or type “mturk alternatives” 
into any search engine. Readers inter- 
ested in ethical design of labor plat- 
forms should seek recent discussions 
on “platform cooperativism” (for example,
platformcoop.net).

Viewpoint


Computational Social
Science ≠ Computer Science
+ Social Data


The important intersection of computer science and social science. 


This Viewpoint is about differences
between computer science
and social science, and
their implications for computational
social science. Spoiler
alert: The punchline is simple. Despite
all the hype, machine learning is not a
be-all and end-all solution. We still need
social scientists if we are going to use
machine learning to study social phenomena
in a responsible and ethical manner.
I am a machine learning researcher
by training. That said, my recent work 
has been pretty far from traditional 
machine learning. Instead, my focus 
has been on computational social sci- 
ence—the study of social phenomena 
using digitized information and com- 
putational and statistical methods. 
For example, imagine you want to 
know how much activity on websites 
such as Amazon or Netflix is caused by 
recommendations versus other fac- 
tors. To answer this question, you 
might develop a statistical model for 
estimating causal effects from observa- 
tional data such as the numbers of rec- 
ommendation-based visits and num- 
bers of total visits to individual product 
or movie pages over time.’ 
Alternatively, imagine you are inter- 
ested in explaining when and why sen- 
ators’ voting patterns on particular is- 
sues deviate from what would be 
expected from their party affiliations 
and ideologies. To answer this question,
you might model a set of issue-based
adjustments to each senator’s
ideological position using their con- 
gressional voting history and the corre- 
sponding bill text.** 

Finally, imagine you want to study 
the faculty hiring system in the U.S. to 
determine whether there is evidence of 
a hierarchy reflective of systematic so- 
cial inequality. Here, you might model 
the dynamics of hiring relationships 
between universities over time using 
the placements of thousands of tenure- 
track faculty.’ 

Unsurprisingly, tackling these kinds
of questions requires an interdisciplinary
approach; indeed, computational
social science sits at the intersection
of computer science, statistics,
and social science.

For me, shifting away from tradi- 
tional machine learning and into this 
interdisciplinary space has meant that 
I have needed to think outside the algo- 
rithmic black boxes often associated 
with machine learning, focusing in- 
stead on the opportunities and chal- 
lenges involved in developing and us- 
ing machine learning methods to 
analyze real-world data about society. 

This Viewpoint constitutes a reflection
on these opportunities and challenges.
I structure my discussion here
around three points (goals, models,
and data) before explaining how machine
learning for social science therefore
differs from machine learning for
other applications.


Goals 
When I first started working in compu- 
tational social science, I kept overhear- 
ing conversations between computer 
scientists and social scientists that in- 
volved sentences like, “I don’t get it— 
how is that even research?” And I could 
not understand why. But then I found 
this quote by Gary King and Dan Hop- 
kins—two political scientists—that, I 
think, really captures the heart of this 
disconnect: “[C]omputer scientists may 
be interested in finding the needle in 
the haystack—such as [...] the right Web 
page to display from a search—but so- 
cial scientists are more commonly inter- 
ested in characterizing the haystack.”® 
In other words, the conversations I 
kept overhearing were occurring be- 
cause the goals typically pursued by 
computer scientists and social scientists 
fall into two very different categories. 
The first category is prediction. Predic- 
tion is all about using observed data to 
reason about missing information or fu- 
ture, yet-to-be-observed data. To use King 
and Hopkins’ terminology, these are 
“finding the needle” tasks. In general, it 
is computer scientists and decision mak- 
ers who are most interested in them. Sure 
enough, machine learning has traditional- 
ly focused on prediction tasks—such as 
classifying images, recognizing handwrit- 
ing, and playing games like chess and Go. 
The second category is explanation. 
Here the focus is on “why” or “how” ques- 
tions—in other words, finding plausible 
explanations for observed data. These ex- 
planations can then be compared with 
established theories or previous findings, 
or used to generate new theories. Expla- 
nation tasks are therefore “characteriz- 
ing the haystack” tasks and, in general, it 
is social scientists who are most inter- 
ested in them. As a result, social scien- 
tists are trained to construct careful re- 
search questions with clear, testable 
hypotheses. For example, are women 
consistently excluded from long-term 
strategic planning in the workplace? 
Are government organizations more 
likely to comply with a public records 
request if they know that their peer or- 
ganizations have already complied? 



Models 

These different goals—prediction and 
explanation—lead to very different 
modeling approaches. In many predic- 
tion tasks, causality plays no role. The 
emphasis is firmly on predictive ac- 
curacy. In other words, we do not care 
why a model makes good predictions; 
we just care that it does. As a result, 
models for prediction seldom need to 
be interpretable. This means that there 
are few constraints on their structure. 
They can be arbitrarily complex black 
boxes that require large amounts of 
data to train. For example, GoogLeNet, 
a “deep” neural network, uses 22 layers 
with millions of parameters to classify 
images into 1,000 distinct categories." 

In contrast, explanation tasks are 
fundamentally concerned with causal- 
ity. Here, the goal is to use observed 
data to provide evidence for or
against causal explanations. As a
result, models for explanation must be 
interpretable. Their structure must be 
easily linked back to the explanation 
of interest and grounded in existing 
theoretical knowledge about the 
world. Many social scientists therefore 
use models that draw on ideas from 
Bayesian statistics—a natural way to 
express prior beliefs, represent uncer- 
tainty, and make modeling assump- 
tions explicit.’ 

To put it differently, models for pre- 
diction are often intended to replace hu- 
man interpretation or reasoning, where- 
as models for explanation are intended 
to inform or guide human reasoning. 


Data 

As well as pursuing different goals, 
computer scientists and social scien- 
tists typically work with different types 
of data. Computer scientists usually
work with large-scale, digitized datasets,
often collected and made available
for no particular purpose other
than “machine learning research.” In
contrast, social scientists often use 
data collected or curated in order to 
answer specific questions. Because 
this process is extremely labor inten- 
sive, these datasets have traditionally 
been small scale. 

But—and this is one of the driving 
forces behind computational social sci- 
ence—thanks to the Internet, we now 
have all kinds of opportunities to ob- 
tain large-scale, digitized datasets that 
document a variety of social phenome- 
na, many of which we had no way of 
studying previously. For example, my 
collaborator Bruce Desmarais and I 
wanted to conduct a data-driven study 
of local government communication 
networks, focusing on how political ac- 
tors at the local level communicate with 
one another and with the general pub- 
lic. It turns out that most U.S. states 
have sunshine laws that mimic the fed- 
eral Freedom of Information Act. These 
laws require local governments to ar- 
chive textual records—including, in 
many states, email—and disclose them 
to the public upon request. 

Desmarais and I therefore issued 
public records requests to the 100 
county governments in North Carolina, 
requesting all non-private email mes- 
sages sent and received by each coun- 
ty’s department managers during a ran- 
domly selected three-month time 
frame. Out of curiosity, we also decided 
to use the process of requesting these 
email messages as an opportunity to 
conduct a randomized field experiment 
to test whether county governments are 
more likely to fulfill a public records re- 
quest when they are aware that their 
peer governments have already fulfilled 
the same request. 

On average, we found that counties
informed that their peers
had already complied took fewer days 
to acknowledge our request and were 
more likely to actually fulfill it. And we 
ended up with over half a million 
email messages from 25 different 
county governments.’ 


Challenges 

Clearly, new opportunities like this 
are great. But these kinds of opportu- 
nities also raise new challenges. Most 
conspicuously, it is very tempting to
say, “Why not use these large-scale,
social datasets in combination with 
the powerful predictive models devel- 
oped by computer scientists?” How- 
ever, unlike the datasets tradition- 
ally used by computer scientists, these 
new datasets are often about people 
going about their everyday lives—their 
attributes, their actions, and their in- 
teractions. Not only do these datasets 
document social phenomena on a 
massive scale, they often do so at the 
granularity of individual people and 
their second-to-second behavior. As 
a result, they raise some complicated 
ethical questions regarding privacy, 
fairness, and accountability. 

It is clear from the media that one of 
the things that terrifies people the most 
about machine learning is the use of 
black-box predictive models in social 
contexts, where it is possible to do more 
harm than good. There is a great deal of 
concern—and rightly so—that these 
models will reinforce existing structur- 
al biases and marginalize historically 
disadvantaged populations. 

In addition, when datapoints are 
humans, error analysis takes on a 
whole new level of importance because 
errors have real-world consequences 
that involve people’s lives. It is not 
enough for a model to be 95% accu- 
rate—we need to know who is affected 
when there is a mistake, and in what 
way. For example, there is a substantial 
difference between a model that is 95% 
accurate because of noise and one that 
is 95% accurate because it performs 
perfectly for white men, but achieves 
only 50% accuracy when making pre- 
dictions about women and minorities. 
Even with large datasets, there is al- 
ways proportionally less data available 
about minorities, and statistical pat- 
terns that hold for the majority may be 
invalid for a given minority group. As a 
result, the usual machine learning objective
of “good performance on average”
may be detrimental to those in a
minority group.’” 
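
To make the arithmetic concrete, here is a minimal Python sketch with made-up numbers. It assumes a hypothetical evaluation set of 1,000 people in which 90% belong to a majority group and 10% to a minority group; these proportions, the group labels, and the helper name accuracy_by_group are illustrative assumptions, not figures from any study. A model that is always right about the majority and right only half the time about the minority still scores 95% overall.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return (overall accuracy, {group: accuracy}) for (group, correct) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += int(correct)
    overall = sum(hits.values()) / sum(totals.values())
    per_group = {group: hits[group] / totals[group] for group in totals}
    return overall, per_group


# Hypothetical evaluation set: 900 majority cases, all predicted correctly,
# and 100 minority cases, only half of them predicted correctly.
records = (
    [("majority", True)] * 900
    + [("minority", True)] * 50
    + [("minority", False)] * 50
)

overall, per_group = accuracy_by_group(records)
print(f"overall: {overall:.2f}")        # overall: 0.95
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")        # majority: 1.00, then minority: 0.50
```

Reporting the per-group numbers alongside the headline figure is one simple starting point for the kind of error analysis described here.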

Thus, when we use machine learn- 
ing to reason about social phenome- 
na—and especially when we do so to 
draw actionable conclusions—we 
have to be exceptionally careful,
more so than when we use machine
learning in other contexts. But here is
the thing: these ethical challenges are 
not entirely new. Sure, they may be
new to most computer scientists, but
they are not new to social scientists. 


Conclusion 

To me, then, this highlights an impor- 
tant path forward. Clearly, machine 
learning is incredibly useful—and, in 
particular, machine learning is useful 
for social science. But we must treat 
machine learning for social science 
very differently from the way we treat 
machine learning for, say, handwriting 
recognition or playing chess. We can- 
not just apply machine learning meth- 
ods in a black-box fashion, as if com- 
putational social science were simply 
computer science plus social data. We 
need transparency. We need to priori- 
tize interpretability—even in predictive 
contexts. We need to conduct rigorous, 
detailed error analyses. We need to 
represent uncertainty. But, most im- 
portantly, we need to work with social 
scientists in order to understand the 
ethical implications and consequences 
of our modeling decisions.

Beyond its role as a protocol for managing and
transferring money, the Bitcoin protocol creates a 
complex system of economic incentives that govern its 
inner workings. These incentives strongly impact the 
protocol’s capabilities and security guarantees, and the 
path of its future development. This article explores 
these economic undercurrents, their strengths and 
flaws, and how they influence the protocol. 

Bitcoin, which continues to enjoy growing popularity, 
is built upon an open peer-to-peer (P2P) network of 
nodes.’ The Bitcoin system is “permissionless”—anyone 
can choose to join the network, transfer money, and 
even participate in the authorization of transactions. Key 
to Bitcoin’s security is its resilience to manipulations 
by attackers who may choose to join the system under 
multiple false identities. After all, anyone can download 
the open-source code for a Bitcoin node and add as 
many computers to this network as they like, without 
having to identify themselves to others. To counter
this, the protocol requires nodes to
show proof that they exerted computational
effort to solve hard cryptographic
puzzles (proof-of-work) in order to participate
actively in the protocol.

Nodes that engage in such work are 
called miners. The system rewards min- 
ers with bitcoins for generating proof- 
of-work, and thus sets the incentives 
for such investment of efforts. 

The first and most obvious effect of 
participants getting paid in bitcoins for
running software on their computers
was that, once bitcoins had sufficient 
value, people started mining quite 
a lot. In fact, efforts to mine intensi- 
fied to such a degree that most min- 
ing quickly transitioned to dedicated 
computer farms that used specialized 
gear for this purpose: first, GPUs that 
were used to massively parallelize 
the work; and later, custom-designed 
chips, or ASICs (application-specific 
integrated circuits), tailored for the 
specific computation at the core of
the protocol (machines with current
ASICs are about a million times fast- 
er than regular PCs when perform- 
ing this work). The Bitcoin network 
quickly grew and became more se- 
cure, and competition for the pay- 
ments given out periodically by the 
protocol became fierce. 

Before discussing the interplay be- 
tween Bitcoin’s security and its eco- 
nomics, let’s quickly look at the rules 
of the protocol itself; these give birth to 
this complex interplay.

A Quick Primer on the Bitcoin Protocol

Users who hold bitcoins and wish to 
transfer them send transaction mes- 
sages (via software installed on their 
computer or smartphone) to one of the 
nodes on the Bitcoin network. Active 
nodes collect such transactions from 
users and spread them out to their 
peers in the network, each node inform- 
ing other nodes it is connected to about 
the requested transfer. Transactions are 
then aggregated in batches called blocks.
Blocks, in turn, are chained together
to create the blockchain, a record of all 
accepted bitcoin transactions. Each 
block in the chain references its pre- 
decessor block by including a crypto- 
graphic hash of that block—effectively 
a unique identifier of that predecessor. 
A complete copy of the blockchain is 
kept at every node in the Bitcoin net- 
work. The process of block creation 
is called mining. One of its outcomes 
(among others) is the printing of fresh 
coins, which we call minting. 
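
The chaining idea can be sketched in a few lines of Python. The block layout below (a dictionary with prev_hash and transactions fields, encoded as JSON and hashed with a single SHA-256) is a simplification assumed for illustration; it is not Bitcoin's actual block format or hashing scheme, and the function names are invented here. The point is only that changing any earlier block changes its hash and so breaks the reference stored in its successor.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder predecessor for the very first block


def block_hash(block):
    """Hash a block's contents (toy encoding: canonical JSON, one SHA-256)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain, transactions):
    """Add a block that references the hash of the current tip."""
    prev_hash = block_hash(chain[-1]) if chain else GENESIS_PREV
    chain.append({"prev_hash": prev_hash, "transactions": list(transactions)})


def links_are_valid(chain):
    """Check that every block references the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["alice pays bob 1"])
append_block(chain, ["bob pays carol 1"])
append_block(chain, ["carol pays dave 1"])
print(links_are_valid(chain))  # True

# Rewriting history in the middle breaks the link stored in the next block.
chain[1]["transactions"] = ["bob pays mallory 1"]
print(links_are_valid(chain))  # False
```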

The rules of the protocol make
block creation extremely difficult; a
block is considered legal only if it contains
the answer to a hard cryptographic
puzzle. As compensation, whenever
miners manage to create blocks they 
are rewarded with bitcoins. Their re- 
ward is partly made up of newly mint- 
ed bitcoins and partly of mining fees 
collected from all of the transactions 
embedded in their blocks. The rate of 
minting is currently 12.5 bitcoins per 
block. This amount is halved approxi- 
mately every four years. As this amount 
decreases, Bitcoin begins to rely more 
and more on transaction fees to pay 
the miners. 
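
The minting schedule can be written down directly. The sketch below assumes the commonly cited consensus parameters, which the text above does not spell out: an initial subsidy of 50 bitcoins and a halving every 210,000 blocks, which at 10 minutes per block is roughly four years. Amounts are in satoshis (one hundred-millionth of a bitcoin), and the function name block_subsidy is chosen here for illustration.

```python
COIN = 100_000_000            # satoshis per bitcoin
INITIAL_SUBSIDY = 50 * COIN   # assumed starting subsidy, in satoshis
HALVING_INTERVAL = 210_000    # assumed number of blocks between halvings


def block_subsidy(height):
    """Newly minted satoshis awarded to the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0  # the subsidy has been shifted away entirely
    return INITIAL_SUBSIDY >> halvings  # integer halving, rounding down


for height in (0, 210_000, 420_000, 630_000):
    print(height, block_subsidy(height) / COIN)
# 0 50.0
# 210000 25.0
# 420000 12.5
# 630000 6.25
```

Under these assumptions, the 12.5-bitcoin figure quoted above corresponds to the period that starts at block 420,000.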

The key to Bitcoin’s operation is to 
get all nodes to agree on the contents 
of the blockchain, which serves as the 
record of all transfers in the system. 
Blocks are thus propagated quickly 
to all nodes in the network. Still, it 
is sometimes possible for nodes to 
receive two different versions of the 
blockchain. For example, if two nodes 
manage to create a block at the same 
time, they may hold two different ex- 
tensions to the blockchain. These 
blocks might contain different sets of 
payments, and so a decision must be 
made on which version to accept. 

The Bitcoin protocol dictates that 
nodes accept only the longest chain as 
the correct version of events, as shown 
in the accompanying figure. (To be 
more precise, nodes select the chain 
that contains the most accumulated 
computational work. This is usually the 
longest chain.) This rule, often called 
the “longest chain rule,” provides Bit- 
coin with its security. An attacker who 
wishes to dupe nodes into believing 
that a different set of payments has 
occurred will need to produce a longer
chain than that of the rest of the
network, a task that is incredibly difficult
because of the proof-of-work
required for each block’s creation. In
fact, as long as the attacker has less 
computational power than the entire 
Bitcoin network put together, blocks 
and transactions in the blockchain become
increasingly hard to replace as
the chain above them grows. 
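
One way to see why deeper blocks are harder to replace is a simplified random-walk model, in the spirit of the analysis in the original Bitcoin white paper. Assume the attacker controls a fraction q of the total computational power, that each new block is found by the attacker with probability q and by the honest network with probability 1 - q, and that network delays and strategic behavior are ignored. Under those assumptions, the chance that the attacker ever makes up a deficit of z blocks is (q / (1 - q)) raised to the power z whenever q is below one half. The function name below is illustrative.

```python
def catch_up_probability(q, z):
    """Chance that an attacker with power share q ever erases a z-block deficit.

    Simplified gambler's-ruin model: every block goes to the attacker with
    probability q and to the honest network with probability 1 - q.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a fraction between 0 and 1")
    if z < 0:
        raise ValueError("z must be non-negative")
    p = 1.0 - q
    if q >= p:
        return 1.0  # a majority attacker eventually catches up
    return (q / p) ** z


# A hypothetical attacker with 10% of the power, at increasing depths.
for depth in (1, 2, 6):
    print(depth, catch_up_probability(0.10, depth))
```

The probability shrinks geometrically with depth, which is the quantitative content of the claim that blocks become harder to replace as the chain above them grows.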

This difficulty in replacing the chain 
implies that it takes many attempts be- 
fore an attacker can succeed in doing 
so. These failed attempts supposedly 
impose a cost on attackers—mining 
blocks off of the longest chain without 
getting the associated mining rewards. 
Naive attacks are indeed costly for at- 
tackers (more sophisticated attacks are 
discussed later). 

The accompanying figure
shows the evolution of the blockchain:
forks appear and are resolved as one of 
the branches becomes longer than 
the other. Blocks that are off the lon- 
gest chain are eventually aban- 
doned. They are no longer extended, 
their contents (transactions colored 
in red) are ignored, and the miners 
that created them receive no reward. 
At point 1 there are two alternative 
chains resulting from the creation of 
a block that did not reference the latest 
tip of the blockchain. At point 2 the 
fork is resolved, as one chain is longer 
than the other. At point 3 there is 
another fork that lasted longer, and 
at point 4 the second fork is resolved. 
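
The selection rule itself is short enough to sketch. In the Python below, each candidate chain is a list of blocks carrying a work field that stands in for the computational effort behind that block; the field names, the example values, and the tie-breaking choice (the candidate listed first wins, as a stand-in for "first seen") are assumptions made for illustration. The example is built so that the chain with more accumulated work is not the one with more blocks.

```python
def total_work(chain):
    """Sum the work attributed to every block in a candidate chain."""
    return sum(block["work"] for block in chain)


def select_chain(candidates):
    """Pick the candidate with the most accumulated work.

    max() returns the first maximal element, so ties go to the earlier
    candidate in the list.
    """
    return max(candidates, key=total_work)


long_easy = [{"id": "a1", "work": 1}, {"id": "a2", "work": 1}, {"id": "a3", "work": 1}]
short_hard = [{"id": "b1", "work": 2}, {"id": "b2", "work": 2}]

best = select_chain([long_easy, short_hard])
print([block["id"] for block in best])  # ['b1', 'b2']
```

When all blocks carry the same work, the rule reduces to picking the longest chain, which is the usual case described above.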


Bitcoin Economics 101: Difficulty Adjustment and the Economic Equilibrium of Mining

Bitcoin’s rate of block creation is kept 
roughly constant by the protocol: 
Blocks are created at random intervals 
of roughly 10 minutes in expectation. 
The difficulty of the proof-of-work re- 
quired to generate blocks increases 
automatically if blocks are created too 
quickly. This mechanism has been put 
in place to ensure that blocks do not 
flood nodes as more computational 
power is added to the system. The sys- 
tem thus provides payments to miners 
at a relatively constant rate, regardless 
of the amount of computational power 
invested in mining. 
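
As a rough sketch of how such an adjustment can work, the Python below rescales the difficulty by the ratio of the intended time to the observed time for a batch of blocks. The specific constants (a 2,016-block adjustment window and a cap of a factor of four on any single change) are the parameters commonly described for Bitcoin and are not stated in the text above. The real protocol operates on an integer target rather than a floating-point difficulty, so treat this as an approximation with illustrative names.

```python
RETARGET_INTERVAL = 2016                              # assumed blocks per adjustment window
TARGET_SPACING = 10 * 60                              # intended seconds between blocks
TARGET_TIMESPAN = RETARGET_INTERVAL * TARGET_SPACING  # two weeks, in seconds
MAX_ADJUSTMENT = 4                                    # assumed cap on a single change


def retarget(old_difficulty, actual_timespan):
    """Return the new difficulty after one adjustment window.

    actual_timespan is the number of seconds the last window really took.
    Faster-than-intended windows raise the difficulty; slower ones lower it.
    """
    lower = TARGET_TIMESPAN // MAX_ADJUSTMENT
    upper = TARGET_TIMESPAN * MAX_ADJUSTMENT
    clamped = min(max(actual_timespan, lower), upper)
    return old_difficulty * TARGET_TIMESPAN / clamped


# Blocks arrived twice as fast as intended, so the difficulty doubles.
print(retarget(100.0, TARGET_TIMESPAN // 2))  # 200.0
```

Because the correction depends only on how fast blocks arrived, adding computational power shortens the next window and pushes the difficulty up until the 10-minute average is restored.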

Clearly, as the value (in U.S. dollars) 
of bitcoins rises, the mining business 
(which yields payments that are denominated
in bitcoins) becomes more
lucrative. More participants then find
it profitable to join the group of min- 
ers, and, as a consequence, the difficul- 
ty of block creation increases. With this 
increase in difficulty, mining blocks 
slowly becomes more expensive. In the 
ideal case, the system reaches equilib- 
rium when the cost of block creation 
equals the amount of extracted re- 
wards. In fact, mining will always be 
slightly profitable—mining is risky, 
and also requires an initial investment 
in equipment, and some surplus in 
the rewards must compensate for this. 
Hence, Bitcoin’s security effectively ad- 
justs itself to match its value: A higher 
value also implies higher security for 
the protocol. 
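
The equilibrium argument can be turned into a back-of-the-envelope calculation. The sketch below assumes a deliberately crude cost model in which every hash costs the same fixed amount (electricity plus amortized equipment), and in which miners keep joining until the cost of a block, marked up by a risk premium, equals the reward. All names and numbers are hypothetical; real mining costs are neither uniform nor constant.

```python
BLOCK_INTERVAL_SECONDS = 600  # about 10 minutes between blocks


def equilibrium_hash_rate(reward_usd, cost_per_hash_usd, risk_premium=0.0):
    """Total hash rate (hashes per second) at which cost absorbs the reward.

    The network performs hash_rate * BLOCK_INTERVAL_SECONDS hashes per block
    on average, so the cost of a block is that count times cost_per_hash_usd.
    Setting cost * (1 + risk_premium) equal to the reward and solving for
    hash_rate gives the expression below.
    """
    if cost_per_hash_usd <= 0:
        raise ValueError("cost_per_hash_usd must be positive")
    return reward_usd / ((1.0 + risk_premium) * BLOCK_INTERVAL_SECONDS * cost_per_hash_usd)


# Hypothetical inputs: a $50,000 block reward and a cost of 1e-17 dollars per hash.
base = equilibrium_hash_rate(50_000, 1e-17)
doubled = equilibrium_hash_rate(100_000, 1e-17)
with_premium = equilibrium_hash_rate(50_000, 1e-17, risk_premium=0.10)

print(doubled / base)        # close to 2: doubling the reward doubles the hash rate
print(with_premium < base)   # True: a risk premium leaves miners a surplus
```

In this toy model the equilibrium hash rate, and with it the cost of out-computing the network, scales linearly with the dollar value of the reward.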

As mining rewards continue to 
decline (as per the protocol’s min- 
ing schedule), the incentive to cre- 
ate blocks is expected to rely more on 
transaction fees. If a sudden drop in 
bitcoin transaction volume occurs, 
these fees might be insufficient to 
compensate miners for their compu- 
tational resources. Some miners might 
then halt their block creation process, 
temporarily. This may compromise the 
system, as the security of transactions 
depends on all honest miners actively 
participating. (For additional work on 
the incentives in Bitcoin after mining 
declines, see Carlsten et al.*) 

Many complain that the computa- 
tion required to create blocks wastes 
resources (especially electricity) and 
has no economic goal other than im- 
posing large costs on would-be attack- 
ers of the system. The proof-of-work is 
indeed a solution to a useless crypto- 
graphic puzzle—except, of course, that 
this “useless” work secures the Bitcoin 
network. But what if some of the work 
could be useful? Or could be produced 
more efficiently? If mining does not 
entail a waste of resources for each 
node, then it also costs nothing for at- 
tackers to attack the system. In fact, if 
the proof-of-work is less costly to solve, 
more honest participants join min- 
ing (to collect the rewards), and soon 
the difficulty adjustment mechanism 
raises the difficulty again. Hence, in 
a sense, the Bitcoin proof-of-work is 
built to spend a certain amount of re- 
sources no matter how efficient an 
individual miner becomes. To derive 
substantial benefits from mining without
an offsetting increase in costs requires
a proof-of-work that is useful to
society at large but cannot provide val- 
ue to the individual miner. (For some 
attempts at using other problems as a 
basis for proof-of-work, see Ball et al., 
Miller et al.,° and Zhang et al.*) 


Mining Decentralization 

The key aspect of the Bitcoin protocol 
is its decentralization: no single entity 
has a priori more authority or control 
over the system than others. This pro- 
motes both the resilience of the sys- 
tem, which does not have a single an- 
chor of trust or single point of failure, 
and competition among the different 
participants for mining fees. 

To maintain this decentralization, 
it is important that mining activity in 
Bitcoin be done by many small entities 
and that no single miner significantly 
outweigh the others. Ideally, the rewards
that are given to miners should
reflect the amount of effort they put in:
a miner who contributes an α-fraction
of the computational resources should
create an α-fraction of the blocks on average,
and as a consequence extract a
proportional α-fraction of all allocated
fees and block rewards.

In practice, some participants can 
benefit disproportionately from min- 
ing, for several different reasons. An 
unbalanced reward allocation of this 
sort creates a bias in favor of larger 
miners with more computational 
power, making them more profitable 
than their smaller counterparts and 
creating a constant economic under- 
current toward the centralization of 
the system. Even slight advantages 
can endanger the system, as the miner 
can use additional returns to purchase 
more and more computational power, 
raising the difficulty of mining as the
miner grows and pushing the other
smaller (and, hence, less profitable) 
miners out of the game. The resulting 
winner-takes-all dynamic inevitably 
leads to centralization within the sys- 
tem, which is then at the mercy of the 
prevailing miner, and no security prop- 
erties can be guaranteed. 
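
A toy model shows how a slight edge can compound. The Python below tracks one miner's share of the total computational power when every miner reinvests a fixed fraction of its expected returns into more power, and one miner earns slightly more per unit of power than the rest. The reinvestment rate, the size of the edge, and the function name are all hypothetical, and the model ignores costs, difficulty changes, and miners quitting, so it illustrates the direction of the effect rather than its speed.

```python
def simulate_share(initial_share, advantage, rounds, reinvest_rate=0.1):
    """Return the advantaged miner's share of total power after each round.

    Each round, every miner grows its power in proportion to its returns.
    The advantaged miner's return per unit of power is (1 + advantage)
    times that of everyone else.
    """
    mine = initial_share
    others = 1.0 - initial_share
    history = [mine / (mine + others)]
    for _ in range(rounds):
        mine += reinvest_rate * mine * (1.0 + advantage)
        others += reinvest_rate * others
        history.append(mine / (mine + others))
    return history


fair = simulate_share(0.30, advantage=0.0, rounds=500)
edge = simulate_share(0.30, advantage=0.05, rounds=500)

print(round(fair[-1], 3))  # 0.3: with proportional rewards the share is stable
print(edge[-1] > 0.5)      # True: a 5% edge eventually yields a majority
```

With no edge, the share never moves, which is the proportional ideal described above; with any positive edge, it drifts steadily toward one.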

ASIC mining. The appearance of
ASICs initially alarmed the Bitcoin 
community. ASICs were orders of 
magnitude more efficient at mining 
bitcoins than previous systems. As 
this special hardware was not initially 
easy to acquire, it provided its own- 
ers with a great advantage over other 
miners—they could mine at a much 
lower cost. Those with this advantage 
would add ASIC-based proof-of-work 
to the system until the difficulty level 
would be so high that everyone else 
would quit mining. The risk was then 
that a single large miner would have 
sole access to ASICs and would come 
to dominate the Bitcoin system. Con- 
cerns subsided after some time, as 
ASICs became commercially available 
and more widely distributed. 

In fact, ASIC mining actually intro- 
duces long-term effects that contrib- 
ute to security. Later this article looks 
at how a miner can carry out profitable 
double spending and selfish mining 
attacks. One can argue, however, that 
even selfish and strategic miners are 
better off avoiding such attacks. In- 
deed, a miner who invested millions of 
dollars in mining equipment such as 
ASICs is heavily invested in the future 
value of Bitcoin: the miner’s equip- 
ment is expected to yield payments of 
bitcoins over a long period in the fu- 
ture. Should the miner then use this 
gear to attack the system, confidence 
in the currency would drop, and with
it the value of bitcoins and future rewards.
The interests of miners are
thus, in some sense, aligned with the 
overall health of the system. 

All in all, ASIC mining introduces 
a barrier-to-entry to the system, as or- 
dinary people cannot simply join the 
mining efforts; it thus reduces decen- 
tralization. On the other hand, it intro- 
duces a form of barrier-to-exit, as min- 
ers cannot repurpose their equipment 
to other economic activities; it there- 
fore contributes to security. 

The appearance of competing cryp- 
tocurrencies (for example, Litecoin, 
which essentially cloned Bitcoin), 
some of which use the same proof-of- 
work as Bitcoin, offers alternatives for 
miners who wish to divert their mining 
power elsewhere. This introduces com- 
plex market dynamics. For example, 
when a specific currency loses some 
value, miners will divert their mining 
power to another cryptocurrency until 
the difficulty readjusts. This can cause 
fluctuations in block creation that de- 
stabilize smaller cryptocurrencies. 

Alternative systems with no ASIC 
mining. Interestingly, some crypto- 
currencies use different proof-of-work 
puzzles that are thought to be more 
resistant to ASIC mining, that is, they 
choose puzzles for which it is difficult 
to design specialized hardware; for 
example, Ethereum uses the Ethash
puzzle (https://github.com/ethereum/wiki/wiki/Ethash).
This is often
achieved by designing algorithmic 
problems that require heavy access to 
other resources, such as memory, and 
that can be solved efficiently by com- 
mercially available hardware. 

These alternative systems are in 
principle more decentralized, but on 
the flip side they lack the barrier-to-exit 
effect and its contribution to security. 

A similar effect occurs when cloud 
mining becomes highly available. 
Some mining entities offer their equip- 
ment for rental over the cloud. The cli- 
ents of these businesses are effectively 
miners who do not have a long-term 
stake in the system. As such services 
become cheaper and more accessible, 
anyone can easily become a temporary 
miner, with similar effects on security. 

ASICBoost. Recall that creating a 
block requires solving a cryptographic
puzzle unique to that block. This
involves guessing inputs to a cryptographic
hash function. Solving the
puzzle is mostly done via brute-force 
enumeration of different inputs. 
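
A toy version of this brute-force search is sketched below in Python. It hashes a stand-in header together with a counter (the nonce) and accepts the first nonce whose hash, read as a number, falls below a target. The header bytes, the way the nonce is appended, and the difficulty_bits parameter are simplifications assumed for illustration; Bitcoin's real puzzle is defined over a fixed block-header format with a far smaller target.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of the header plus a 4-byte nonce, read as an integer."""
    data = header + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        if pow_hash(header, nonce) < target:
            return nonce
    return None  # no solution within the nonce range


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a claimed solution takes a single hash evaluation."""
    return pow_hash(header, nonce) < (1 << (256 - difficulty_bits))


header = b"toy block header"
nonce = mine(header, difficulty_bits=12)  # about 4,096 attempts on average
print(nonce is not None and verify(header, nonce, 12))  # True
```

Each extra bit of difficulty doubles the expected number of attempts, while verification stays a single hash; that asymmetry is what makes the puzzle usable as proof of effort.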

A miner can gain an advantage by creating
blocks using more efficient methods
than his or her counterparts. In addition
to better hardware, an advantage
can take a more algorithmic form. In 
fact, an algorithmic “trick” nicknamed 
ASICBoost has recently made head- 
lines. ASICBoost enables the miner to 
reuse some of the computational work 
performed during the evaluation of 
one input for the evaluation of another. 
This algorithm is proprietary, patent 
pending, and it is unclear who is and 
who is not using it. Such an algorithmic 
advantage can be translated to lower 
power consumption per hash. Bitmain, 
a large manufacturer of ASICs for bit- 
coin mining that also operates some 
mining pools, was recently accused by 
some of secretly deploying a hardware 
variant of ASICBoost to increase its 
profits. Allegations were made that this 
company was politically blocking some 
protocol improvements that would co- 
incidentally remove their ability to use 
ASICBoost. 

Communication. Yet another meth- 
od for a miner to become more effi- 
cient is to invest in communication 
infrastructure. By propagating blocks 
faster, and by receiving others’ blocks 
faster, a miner can reduce the chances 
that their blocks will not belong to the 
longest chain and will be discarded 
(“orphaned”). As off-chain blocks re- 
ceive no rewards, a better connection 
to the network translates to reduced 
losses. Admittedly, with Bitcoin’s cur- 
rent block creation rate, this advantage 
is rather marginal; blocks are created 
infrequently, and speeding up delivery 
by just a few seconds yields relatively 
little advantage. Nonetheless, better 
connectivity is a relatively cheap way to 
become more profitable. 

Furthermore, the effects of com- 
munication become much more pro- 
nounced when the protocol is scaled 
up and transaction processing is ac- 
celerated. Today, Bitcoin clears three 
to seven transactions per second on 
average. Changing the parameters of 
Bitcoin to process more transactions 
per second would increase the rate of 
orphan blocks and would amplify the 
advantage of well-connected miners. 
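
The link between block rate and orphaning can be illustrated with a rough model. The sketch below assumes that blocks arrive as a Poisson process and that a block is at risk whenever a rival block is found while it is still propagating. The function name and the 10-second delay are illustrative assumptions, not measurements.

```python
import math


def orphan_probability(delay_s: float, block_interval_s: float) -> float:
    """Chance that a rival block appears while ours is still propagating.

    Simplifying assumption: block discovery is a Poisson process with mean
    spacing block_interval_s, so the chance of at least one rival block in
    a window of delay_s seconds is 1 - exp(-delay_s / block_interval_s).
    """
    return 1.0 - math.exp(-delay_s / block_interval_s)


delay = 10.0  # hypothetical propagation delay, in seconds
for interval in (600.0, 60.0, 10.0):  # 600 s is Bitcoin's target spacing
    risk = orphan_probability(delay, interval)
    print(f"block interval {interval:5.0f} s -> orphan risk {risk:.1%}")
```

Under these assumptions the risk stays below 2% at a 600-second interval but exceeds one half when blocks are as frequent as the propagation delay, which is why faster block creation favors well-connected miners.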

Economies of scale. As with any
large entity, professional miners may
enjoy the economic benefits of size. 
With a larger mining operation, such 
miners are much more likely to invest 
in different optimizations, such as 
finding sources of somewhat cheaper 
electricity, or placing their equipment 
in cooler regions to provide more effi- 
cient cooling to their machines (min- 
ing usually consumes a great deal of 
electricity, and cooling the machines 
presents a real challenge). Large min- 
ers can also purchase ASICs in bulk for 
better prices. All of this translates to 
natural advantages to size, a phenome- 
non that is not specific to Bitcoin but in 
fact appears in many industries. These 
effects give large miners an advantage 
and slowly pull the system toward a 
centralized one. 

Many have raised concerns that 
most of today’s Bitcoin mining is done 
by Chinese miners. They enjoy better 
access to ASICs, cheaper electricity, 
and somewhat lower regulation than 
similar operations in other locations. 
The Chinese government, which tight- 
ly controls Internet traffic in and out 
of China, could choose to disrupt the 
system or even seize the mining equip- 
ment that is within its borders. 


Mining Pools and Risk Aversion 
Bitcoin’s mining process yields very 
high reward but with very low prob- 
ability for each small miner. A single 
ASIC that is running full time may 
have less than a 1-in-600,000 chance 
of mining the next block, which im- 
plies that years can go by without find- 
ing a single block. This sort of high- 
risk/high-reward payoff is not suitable 
for most. Many would prefer a small, 
constant rate of income over long pe- 
riods of time (this is essentially risk 
aversion). A constant income stream
can be used, for example, to pay the 
electric bills for mining. 
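
To see what a 1-in-600,000 chance means in calendar time, the arithmetic can be worked through directly. The sketch below assumes Bitcoin's target spacing of one block every 10 minutes and treats every block as an independent draw. The function names are made up for this illustration.

```python
BLOCK_INTERVAL_MIN = 10  # assumed target spacing between blocks, in minutes
BLOCKS_PER_YEAR = 365 * 24 * 60 // BLOCK_INTERVAL_MIN


def expected_years_to_first_block(p_per_block: float) -> float:
    """Mean wait, in years, for a miner who wins each block with chance p."""
    return (1.0 / p_per_block) / BLOCKS_PER_YEAR


def prob_no_block_within(years: float, p_per_block: float) -> float:
    """Chance of mining no block at all during the given number of years."""
    return (1.0 - p_per_block) ** (years * BLOCKS_PER_YEAR)


p = 1 / 600_000
print(f"expected wait: {expected_years_to_first_block(p):.1f} years")
print(f"chance of an empty year: {prob_no_block_within(1, p):.0%}")
```

With these numbers the expected wait is a little over 11 years, and there is roughly a 92% chance of finding nothing in any given year.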

The formation of pools. Mining 
pools are coalitions of miners that 
combine their computational resourc- 
es to create blocks together and share 
the rewards among members of the 
pool. Since the pool’s workers together 
find blocks much more often than each 
miner alone, they are able to provide 
small continuous payments to each 
worker on a more regular basis. 

From the perspective of the Bitcoin 
network, the pool is just a single mining
node. Pool participants interact
with the pool’s server, which sends 
the next block header that the pool is 
working on to all workers. Each mem- 
ber tries to solve the cryptographic 
puzzle corresponding to this block 
(in fact, they use small variants of the 
same block and work on slightly dif- 
ferent proof-of-work puzzles to avoid 
duplicating work). Whenever a work- 
er finds a solution, it is sent to the 
pool manager, who in turn publishes 
the block to the network. The block 
provides a reward to the pool, which 
the manager then distributes among 
all of the pool’s workers (minus some 
small fee). 

Reward distribution within pools 
and possible manipulations. Many 
pools are public and open to any will- 
ing participant. Obviously, such pools 
must take measures to ensure that 
only members who truly contribute to 
the pool’s mining efforts enjoy a por- 
tion of the rewards. To that end, every 
pool member sends partial solutions 
of the proof-of-work to the pool— 
these are solutions that came “close” 
to being full blocks. Partial solutions 
are much more common than full so- 
lutions, and anyone working on the 
problem can present a steady stream 
of such attempts that fall short of the 
target. This indicates that the worker 
is indeed engaged in work, and can be 
used to assess the amount of compu- 
tational power each worker dedicates 
to the pool. Pools thus reward work- 
ers in some proportion to the num- 
ber of shares that they earn (a share is 
granted for every partial solution that 
is submitted). 
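
The share mechanism can be sketched in a few lines. This is a toy illustration rather than real pool software: the header bytes, the 8-byte nonce encoding, the two targets, and the reward amount are all assumptions chosen so the example runs quickly. Real Bitcoin mining applies double SHA-256 to an 80-byte block header and uses a far harder target.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header plus nonce, read as a big integer."""
    data = header + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


BLOCK_TARGET = 1 << 240  # toy target: a hash below this is a full solution
SHARE_TARGET = 1 << 246  # 64 times easier: a hash below this earns a share


def mine(header: bytes, nonces) -> tuple:
    """Count partial solutions (shares) and full solutions among the nonces."""
    shares = blocks = 0
    for nonce in nonces:
        h = pow_hash(header, nonce)
        if h < SHARE_TARGET:
            shares += 1      # close enough to prove that work was done
        if h < BLOCK_TARGET:
            blocks += 1      # also a valid block for the network
    return shares, blocks


def split_reward(reward: float, shares_by_worker: dict) -> dict:
    """Pay each worker in proportion to the shares it submitted."""
    total = sum(shares_by_worker.values())
    return {w: reward * s / total for w, s in shares_by_worker.items()}


shares, blocks = mine(b"toy-header", range(20_000))
print(f"{shares} shares, {blocks} full solutions")
print(split_reward(12.5, {"alice": 3, "bob": 1}))
```

Because the share target is looser than the block target, every full solution is also a share, and the share count gives the pool manager an estimate of how much work each member performed.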

Fortunately, a pool member who 
has found a valid solution to the puzzle 
cannot steal the rewards. The crypto- 
graphic puzzle depends on the block 
header, which is under the control of 
the pool’s manager. It encodes a com- 
mitment to the contents of the block 
itself (via a cryptographic hash), in- 
cluding the recipient of the block’s 
rewards. After finding a valid solution 
for a specific block header, one cannot 
tamper with the header without invali- 
dating the solution. 

Nonetheless, pools are susceptible to 
some manipulations by strategic miners: 

Pool hopping. In the early days of 
Bitcoin, mining pools would simply 
divide the reward from the latest block
among all workers in proportion to the
number of partial solutions each work- 
er submitted. The number of shares 
was measured from the previous block 
created by the same pool. 

Some workers came up with a way 
to improve their rewards: if a pool was 
unlucky and did not find a block for a 
while, many partial solutions (shares) 
would accumulate. If a block was then 
found by the pool, its reward would be 
split among many shares. Working to 
generate additional shares is just as 
costly as before but yields low expected 
rewards for this very reason. Instead, the 
worker could just switch to another pool 
in which a block had been found more 
recently, and in which each additional 
share granted a higher expected reward. 
If many adopt this behavior, a pool that 
is temporarily unsuccessful should, in 
fact, be completely abandoned by all 
rational miners. Pool-hopping-resistant 
reward schemes were quickly developed 
and adopted by most mining pools.
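
The incentive to hop can be made concrete with a short calculation. The sketch below assumes a purely proportional scheme in which every submitted share independently turns out to be a full solution with probability `p_block`. The function name, the truncation horizon, and the example numbers are illustrative assumptions.

```python
def expected_share_value(shares_so_far: int, p_block: float,
                         reward: float = 1.0, horizon: int = 50_000) -> float:
    """Expected payout of one share submitted now under proportional rewards.

    If the round ends k shares from now (counting this one), the reward is
    split among shares_so_far + k shares. The round ends exactly at step k
    with probability p_block * (1 - p_block) ** (k - 1). The infinite sum is
    truncated at `horizon`, where the remaining probability is negligible.
    """
    value = 0.0
    still_running = 1.0  # probability the round has not ended before step k
    for k in range(1, horizon + 1):
        value += still_running * p_block * reward / (shares_so_far + k)
        still_running *= 1.0 - p_block
    return value


p = 0.001  # hypothetical: one share in a thousand is a full solution
for backlog in (0, 1_000, 5_000):
    print(f"{backlog:5d} shares in round -> {expected_share_value(backlog, p):.6f}")
```

In a fair scheme every share would be worth `p * reward`, here 0.001. Under the proportional rule a share in a fresh round is worth several times that, while a share submitted after a long unlucky streak is worth less, so a rational miner moves to a pool whose round has just begun.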

Block-withholding attacks. While a 
miner cannot steal the block reward 
of a successful solution, he or she can 
still deny the rewards to the rest of
the miners in the pool. The miner can 
choose to submit only partial solutions 
to the pool’s manager but discard all 
successful solutions. The miner thus 
receives a share of the rewards when 
others find a solution, without provid- 
ing any actual contribution to the pool. 
Discarding the successful solution sab- 
otages the pool, and causes a small loss 
of income to the attacker. 
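
The revenue effects can be estimated with a simple model. The sketch below is an illustration under stated assumptions, not a calculation taken from this article: total network hash power is normalized to 1, the difficulty is assumed to readjust so that rewards are divided according to the power that actually produces blocks, and the function and parameter names are invented for the example. It covers both a lone withholding miner and a larger miner that withholds with only part of its power while mining honestly with the rest.

```python
def revenue_per_unit(attacker: float, victim: float, infiltrate: float) -> float:
    """Attacker's income per unit of hash power; honest mining yields 1.0.

    attacker   -- attacker's total share of network hash power
    victim     -- honest hash power inside the victim pool
    infiltrate -- attacker power registered in the victim pool that submits
                  shares but discards every full solution
    """
    if not (0 <= infiltrate <= attacker and attacker + victim <= 1):
        raise ValueError("inconsistent hash power shares")
    effective_total = 1.0 - infiltrate        # withheld power finds no blocks
    direct = (attacker - infiltrate) / effective_total
    pool_income = victim / effective_total    # earned by the victim's honest power
    per_share = pool_income / (victim + infiltrate)  # diluted by infiltrators
    return (direct + infiltrate * per_share) / attacker


print(f"lone withholder: {revenue_per_unit(0.01, 0.30, 0.01):.4f}")
print(f"attacking pool:  {revenue_per_unit(0.30, 0.30, 0.03):.4f}")
```

Under these assumptions a lone withholder earns slightly less than an honest miner would, whereas a miner holding 30% of the network that diverts a tenth of its power into a 30% victim pool comes out ahead.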

In spite of the losses to an attack- 
er, in some situations it is worthwhile 
for mining pools to devote some of 
their own mining power to sabotage 
their competitors: the attacker pool 
infiltrates the victim pool by register- 
ing some of its miners as workers in 
the victim pool. These workers then 
execute a block-withholding attack. 
Careful calculations of the costs and 
rewards show that, in some scenar- 
ios (depending on the sizes of the 
attacker and victim pools), the attack
is profitable. To prevent such
schemes, a slight modification of the 
mining protocol has been proposed. In 
the modified version, workers would 
not be able to discern between partial 
and full solutions to the proof-of-work 
puzzle and would not be able to selectively
withhold full solutions.

Eliminating pools. While pools are
good for small miners, mitigating 
their risk and uncertainty, they intro- 
duce some centralization to the system. 
The pool operator is essentially con- 
trolling the combined computational 
resources of many miners and is there- 
fore quite powerful. Some researchers 
proposed a technical modification to 
the mining protocol that undermines 
the existence of public pools altogether.
Under this scheme, after finding
a valid solution to the block, the pool 
member who mined it would still be 
able to redirect the rewards to them- 
selves (without invalidating the solu- 
tion). Assuming many miners would 
claim the rewards for themselves, 
pools would not be profitable and 
would therefore dissolve. 


The Economics of Attacks
and Deviations from the Rules
Earlier, this article described methods 
by which a miner can become more 
dominant within the protocol—both to 
profit more than his or her fair share and 
to generate more of the blocks in the 
chain. The methods discussed thus far 
do not violate any of the protocol’s rules; 
in some sense, miners are expected to 
make the most of their hardware and 
infrastructure. This section discusses 
direct violations of the rules of the pro- 
tocol that allow miners to profit at the 
expense of others. In a sense, the ex- 
istence of such strategies implies that 
there is something fundamentally bro- 
ken in the protocol’s incentive struc- 
ture: rational profit-maximizing par- 
ticipants will not follow it. 

Informally, the protocol instructs 
any node to: validate every new mes- 
sage it receives (block/transaction); 
propagate all valid messages to its 
peers; broadcast its own new blocks 
immediately upon creation; and build
its new blocks on top of the longest 
chain known to that node. Attacks on 
the protocol correspond to deviations 
from one or more of these instructions. 

Validation. A miner who does not 
validate incoming messages is vulner- 
able—the next block might include an 
invalid transaction that he or she did 
not verify, or reference an invalid pre- 
decessor block. Other nodes will then 
consider this new block as invalid and 
ignore it. This sets a clear incentive for 
miners to embed in their blocks only
valid transactions and to validate every
new block before accepting it. 

Interestingly, despite this logic, 
sometimes miners mine on top of a 
block without fully validating it. This 
practice is known as SPV mining (sim- 
plified payment verification usually 
refers to the use of thin clients that do 
not read the full contents of blocks). 

Why would miners engage in 
building on top of an unvalidated 
block? The answer again lies in in- 
centives. Some miners apply meth- 
ods to learn about the hash ID of a 
newly created block even before re- 
ceiving its entire contents. One such 
method, known as spy-mining, in- 
volves joining another mining pool 
as a worker to detect block creation 
events. Even when the block is re- 
ceived, it takes time to validate the 
transactions it contains. During this 
time, the miner is aware that the 
blockchain is already longer by one 
block. Therefore, rather than letting 
the mining equipment lie idle until 
the block is validated, the miner de- 
cides to mine on top of it, under the 
assumption that it will most likely be 
valid. To avoid the risk that the next 
block will contain conflicts with the 
transactions of the unverified block, 
the miner does not embed new trans- 
actions in the next block, hoping still 
to collect the block reward. 

There is indeed evidence that min- 
ers are taking this approach. First, 
some fraction of the blocks being 
mined is empty (even when many trans- 
actions are waiting to be approved). An- 
other piece of evidence is related to an 
unfortunate incident that took place in 
July 2015. An invalid block was (unintentionally)
mined due to a bug, and SPV
miners added five additional blocks on 
top of it without validating. Of course, 
other validating miners rejected that 
block and any block that referenced it, 
resulting in a six-block-long fork in the 
network. Blocks that were discarded in 
the fork could have contained double- 
spent transactions. 

This event shows the dangers of SPV 
mining: it lowers the security of Bit- 
coin and may trigger forks in the block- 
chain. Fortunately, miners have vastly 
improved the propagation and valida- 
tion time of blocks, so SPV mining has 
less and less effect. The planned decline
in the minted reward given to
empty blocks will also lower the incentive
to engage in such behavior.

Transaction propagation. A second 
important aspect of the Bitcoin pro- 
tocol pertains to information propa- 
gation: new transactions and blocks 
should be sent to all peers in the net- 
work. Here the incentive to comply 
with the protocol is not so clear. Miners 
may even have a disincentive to share 
unconfirmed transactions that have 
yet to be included in blocks, especially 
transactions that offer high fees. Miners
have a strong incentive to keep such
transactions to themselves until they 
manage to create a block. Sending a 
transaction to others allows them to 
snatch the reward it offers first. Thus 
far, most transaction fees have been 
relatively low, and there is no evidence 
that transactions with high fees are be- 
ing withheld in this way. 

Next, let’s turn our attention to devia- 
tions from the mining protocol intended 
explicitly to manipulate the blockchain. 

Selfish mining. Whenever a miner 
creates a new block, the protocol says it 
should be created on top of the longest 
chain the miner observes (that is, to 
reference the tip of the longest chain 
as its predecessor) and that the miner 
should send the new block immedi- 
ately to network peers. 

Unfortunately, a miner can benefit 
by deviating from these rules and acting
strategically. The miner’s general
strategy is to withhold the blocks’
publication and keep the extension of 
the public chain secret. Meanwhile, 
the public chain is extended by other 
(honest) nodes. The strategic miner 
publishes the chain only when the risk 
that it will not prevail as the longest 
chain is too high. When the miner does 
so, all nodes adopt the longer exten- 
sion that the miner suddenly released, 
as dictated by the protocol, and they 
discard the previous public extension. 

Importantly, this behavior increas- 
es the miner’s share in the longest 
chain—meaning, it increases the per- 
centage of blocks on the eventual lon- 
gest chain that the miner generates. 
Recall that Bitcoin automatically ad- 
justs the difficulty of the proof-of-work 
so as to keep the block creation rate 
constant. Thus, in the long run, a larg- 
er relative share of blocks in the chain 
translates to an increase in the miner’s 
absolute rewards. 
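
How large the gain is depends on the miner's share of hash power. The sketch below evaluates the closed-form estimate commonly attributed to Eyal and Sirer's analysis of a basic selfish-mining strategy. Treat the formula and the parameter names `alpha` and `gamma` as assumptions brought in for illustration, not as this article's own model.

```python
def selfish_relative_revenue(alpha: float, gamma: float) -> float:
    """Selfish miner's long-run fraction of blocks on the longest chain.

    alpha -- selfish miner's fraction of total hash power (below 0.5)
    gamma -- fraction of honest power that builds on the selfish miner's
             block when two competing blocks of equal height circulate
    """
    numerator = (alpha * (1 - alpha) ** 2 * (4 * alpha + gamma * (1 - 2 * alpha))
                 - alpha ** 3)
    denominator = 1 - alpha * (1 + (2 - alpha) * alpha)
    return numerator / denominator


for alpha in (0.20, 1 / 3, 0.40):
    share = selfish_relative_revenue(alpha, gamma=0.0)
    verdict = "gains" if share > alpha + 1e-12 else "does not gain"
    print(f"power {alpha:.2f} -> block share {share:.3f} ({verdict})")
```

With `gamma` set to zero, this estimate says the strategy pays off only for miners holding more than one third of the hash power. Smaller miners lose blocks by withholding, which fits the observation that large miners would have the most to gain.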


There is no definite method to verify 
whether miners are engaging in self- 
ish mining or not. Given that very few 
blocks are orphaned, it seems like this 
practice has not been taken up, at least 
not by large miners (who would also 
have the most to gain from it). One way 
to explain this is that miners who at- 
tempt such manipulation over the long 
term may suffer loss to their reputation 
and provoke outrage by the commu- 
nity. Another explanation is that this 
scheme initially requires losing some 
of the selfish miner’s own blocks, and 
it becomes profitable only in the long 
run (it takes around two weeks for the 
protocol to readjust the difficulty level). 

Double spending is the basic attack 
against Bitcoin users: the attacker pub- 
lishes a legitimate payment to the net- 
work, waits for it to be embedded in the 
blockchain and for the victim to confirm 
it, and then publishes a longer chain of 
blocks mined in secret that do not con- 
tain this payment. The payment is then 
no longer part of the longest chain and, 
effectively, “never happened.” 

This attack incurs a risk: the attack- 
er could lose the rewards for his or her 
blocks if they do not end up in the longest
chain. Surprisingly, and unfortunately,
a persistent attacker can eliminate this
risk by following more sophisticated attack
schemes. The idea is to abandon
the attack frequently, publish the secret 
attack chain, and collect rewards for its 
blocks. By resetting the attack whenever 
the risk of losing block rewards is too 
high, the attacker can eliminate the at- 
tack cost and even be profitable in the 
long term. These schemes are in essence 
a combination of selfish mining and 
double-spending attacks. 

Currently, double spending is not 
observed often in the network. This 
could be because executing a success- 
ful double spend is difficult, or because 
the very miners who could execute 
such attacks successfully also have a 
heavy stake in the system’s reputation. 


Conclusion 

Incentives do indeed play a big role in 
the Bitcoin protocol. They are crucial 
for its security and effectively drive its 
daily operation. As argued here, min- 
ers go to extreme lengths to maximize 
their revenue and often find creative 
ways to do so that are sometimes at 
odds with the protocol.
Cryptocurrency protocols should be
placed on stronger foundations of in- 
centives. There are many areas left to 
improve, ranging from the very basics 
of mining rewards and how they inter- 
act with the consensus mechanism, 
through the rewards in mining pools, 
and all the way to the transaction fee 
market itself.


Being funny is serious work.


BY THOMAS A. LIMONCELLI


Operational Excellence in April Fools’ Pranks


At 10:23 UTC on April 1, 2015, stackoverflow.com enabled
an April Fools’ prank called StackEgg. It was a simple
Tamagotchi-like game that appeared in the upper-right 
corner of the company’s website. Though it had been 
tested, we did not account for the additional network 
activity it would generate. By 13:14 UTC the activity had 
grown to the point of overloading the company’s load 
balancers, making the site unusable. All of the company’s 
Web properties were affected. The prank had, essentially, 
created a self-inflicted denial-of-service attack. 

The engineers involved in the prank didn’t panic. 
They went to a control panel and disabled the feature. 
Network activity returned to normal, and the site 
was operating again by 13:47 UTC. The problem was 
diagnosed, fixed, and new code was pushed into 
production by 14:56 UTC. The prank was saved!

Was Stack Overflow lucky that the
engineers had designed the prank so 
that it could be easily disabled? No, it 
was not luck. It was all in the playbook 
for operational excellence in AFPs 
(April Fools’ pranks). 

A successful AFP depends on many 
operational best practices. In this ar- 
ticle, I will share some of the key ones. 


What Makes an April 
Fools’ Prank Funny? 
Before discussing the technical details, 
let’s look at what makes an AFP funny. 
The best AFPs are topical and absurdist. 

Topical means it refers to current 
events or trends. This makes it relevant 
and “a thinker.” Topical would be dis- 
playing your website upside-down after 
a large and highly publicized acquisi- 
tion by a major Australian competitor. 
(Australians tell me that kind of joke 
never gets old.) Doing that to your web- 
site otherwise just announces that your 
Web developers finally read that part of 
the CSS3 spec. 

Secondly, it must be so absurd that 
it reveals a hidden truth. Absurdist 
humor is not simply silly for silliness’ 
sake. Absurdism acts as a crucible that 
burns away all lies to get to the truth. 

Stack Overflow’s 2017 prank, “Dance 
Dance Authentication,” was both topical
and absurdist. The prank was a
blog post and accompanying dem- 
onstration video for Stack Overflow’s 
new (fictional) authentication system. 
Rather than the usual 2FA (two-factor 
authentication) system that requires 
an authenticator app or key fob, this 
system required users to turn on their 
webcams and dance their password. 
This was topical because recent growth 
in 2FA adoption meant many Internet 
users were experiencing 2FA for the 
first time. It was absurdist because it 
took the added burden and nuisance of 
2FA to an extreme. It revealed the truth 
that badly implemented security sacri- 
fices convenience. 

A scene from Stack Overflow’s 2017 AFP: Dance Dance Authentication (https://www.youtube.com/watch?v=VgC4b9K-gYU).
SCREEN CAPTURE FROM STACK OVERFLOW’S “DANCE DANCE AUTHENTICATION” VIDEO BY ARCADIA CREATIVE

Inspiration for absurdity should
come from reality. For example, the
Go programming language is an intentionally
minimalistic language,
a reaction against bloated languages
such as C++ and Java. It seems like 
every C++ or Java programmer who 
learns Go posts to forums demanding 
dozens of features that are “missing.” 
This leads to a discussion about why 
those features are intentionally miss- 
ing from Go. This discussion seems 
to happen on a weekly basis. A good 
AFP for Go would be a blog post an- 
nouncing that Go 2.0 will include all 
those “missing” features and, in fact, 
they have been implemented and are 
ready for use. The article would then 
link to the download page for Java. 


What Makes an April Fools’
Prank Un-Funny?

A prank should not get in the way of busi- 
ness or harm customers. For example, 
a 2016 Gmail prank called “Drop the 
Mic” gave users a button that would 
send a farewell message to someone, 
then block all email from that person 
... forever. There was no “Are you sure?” 
prompt. As you can guess, this disrupt- 
ed actual customers trying to do actual 
business. Google disabled the prank a
few hours later. 


An AFP should not mock a particu- 
lar person (that is just mean) or group 
of people (that is just hateful). The ex- 
ception to this is that it is always OK to 
mock people more powerful than you. 
Punch up, not down. 

> Punch up: Mock elite people who 
don’t realize how privileged they are; 
mock the CEO who bragged he’s sav- 
ing the company money by using his 
private jet. 

> Don’t punch down: Do not mock 
the less fortunate—for example, don’t 
mock homeless people or any group 
of powerless people in society; racist, 
sexist, or homophobic humor is not 
funny because it is inherently punch- 
ing down. 

An AFP should be funny to the au- 
dience, not just the people who creat- 
ed it. Every year plenty of companies 
produce AFPs that fall flat because 
they are inside jokes that everyone in 
the company finds hiii-larious. That 
is all well and good, but if the inten- 
tion was to make customers laugh, 
it really should not depend on them 
knowing that Larry in accounting 
loves World of Warcraft. 


As with any feature, user acceptance 
testing should be done with a wide va- 
riety of users. Be sure to include some 
nonusers. You might consider doing 
user experience testing, but since most 
companies don’t, why start now? 


Engineer It Like Any Other Feature 
The end-to-end process of creating and 
launching the prank should be the same 
as any other feature. It should start with 
a concept, then have a design and execu- 
tion plan, launch plan, and operational 
runbook. Involve product management. 
Have requirements, specifications, a 
project schedule, testing, and so on. If 
it is a big prank, beta testing with users 
sworn to secrecy may be required. 

Like any major feature, the earlier 
you involve operations, the better. Op- 
erations’ worst nightmare is to be told 
that a major feature is being launched 
tomorrow ... “Would you please set up 
10 new servers and find a petabyte of 
disk space?” April Fools’ pranks are 
no different. They often require extra 
bandwidth, isolated servers, firewall 
rules, and other tasks that take days or 
weeks to complete.


Feature Flags

The prank should be easy to enable 
and disable. Hide the feature behind 
a “feature flag.” With the flag off, the 
feature is in the code but dormant. 
Enabling the prank in production is a 
matter of turning the flag on. Disabling 
it is a simple matter of turning the flag 
off. Developers can test the feature by 
enabling the flag in the development 
and test environments. Some flag sys- 
tems can automatically be on for cer- 
tain user segments. 

Some companies can launch or dis- 
able a feature only by rolling out new 
code into production. This is bad for 
many reasons. It is riskier than fea- 
ture flags: if the release that removes 
a prank is broken, do you revert to 
the previous release (with the prank) 
or the prior release (which may be 
too old to deploy into production)? 
Code pushes are difficult to coordi- 
nate with PR, blog posts, and so on: 
they might take minutes or hours, not 
seconds, like flipping a feature flag. 
Code pushes require more skill: in 
many environments, code pushes are 
done by specific people, who might 
be asleep. In an emergency you want 
to empower anyone to shut off the 
prank. The process should be quick 
and easy. Lastly, if the prank has over- 
loaded the network, it may also affect 
the systems that push new code. Mean- 
while, a feature “flag flip” is simpler 
and more likely to just plain work. 
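
A feature flag can be as simple as a lookup that guards the prank's code path. The sketch below is a minimal in-memory illustration. The class, the flag name `april_fools_prank`, and the widget names are hypothetical, and a production system would keep flag state in a shared store so that a flip takes effect everywhere without a deploy.

```python
class FeatureFlags:
    """Tiny in-memory flag store used only for illustration."""

    def __init__(self):
        self._enabled = {}   # flag name -> on/off for everyone
        self._segments = {}  # flag name -> user segments that are always on

    def set(self, name, enabled):
        self._enabled[name] = bool(enabled)

    def enable_for_segment(self, name, segment):
        self._segments.setdefault(name, set()).add(segment)

    def is_enabled(self, name, segment=None):
        if segment is not None and segment in self._segments.get(name, set()):
            return True
        return self._enabled.get(name, False)  # unknown flags default to off


def render_sidebar(flags, segment=None):
    """Build the list of sidebar widgets; the prank ships dormant."""
    widgets = ["search", "hot-questions"]
    if flags.is_enabled("april_fools_prank", segment):
        widgets.append("prank-game")
    return widgets


flags = FeatureFlags()
flags.enable_for_segment("april_fools_prank", "employees")  # internal testing
print(render_sidebar(flags))              # public view before launch
flags.set("april_fools_prank", True)      # launch
print(render_sidebar(flags))
flags.set("april_fools_prank", False)     # emergency off switch
print(render_sidebar(flags))
```

The point of the design is that launching and disabling are data changes rather than code changes, so anyone on call can perform them in seconds.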

The way you structure an AFP project 
is unusual in that the deadline cannot 
change. There are three levers available 
to managers: deadline, budget, and 
features. If a project is going to be late, 
management must adjust one of those 
three. An AFP, however, cannot adjust 
the deadline and usually has a limited 
budget. Therefore, it is important to 
segment the features of the prank. 
First implement the basic prank, then 
add “would be nice” features. As you 
get closer to the deadline, throw away 
the less important features. When a 
badly structured prank is late, all fea- 
tures will be 80% done, which means 
0% of them can be launched. You blew 
it. When a well-structured prank is late, 
80% of the features are ready to launch, 
and the customers will be no wiser 
about the missing 20%. Structuring 
a project in this way requires skillful 
planning up front.

During the prank, plausible deniability
is important. Act like it is real, or
act like you don’t see it, or act like you 
were not involved. Do, however, include 
a link to a page that explains that this 
is just a joke. They say a joke isn’t funny
if you have to explain it, but if someone
doesn’t realize it is a joke, it can lead to 
unfunny situations and hurt feelings. 
This is the Internet, not Mensa. 

Perform a project retrospective. After
the prank, sit down with everyone
involved and reflect on what went well, 
what didn’t go well, what should be 
done the same way next time, and what 
should have been done differently. Pub- 
lish this throughout the organization. It 
not only makes everyone feel included, 
but it also educates people about how to 
do better next time. Yes, you may have 
overloaded the network and created an 
outage, but if everyone in the organiza- 
tion learned from this experience, your 
organization is now smarter. Every out- 
age that results in organizational learn- 
ing is a blessing. If you hide informa- 
tion, the organization stays ignorant. 


Case Study: The Mustache Prank 
One of the most successful AFPs I was 
involved with was at a previous employ- 
er. Managers had been on a teleconfer- 
ence for an hour brainstorming ideas 
for an AFP. They wanted one that would 
be visible only to employees. There is 
nothing less funny than managers try- 
ing to write a joke, so they turned to me. 
I was a half-manager so they assumed 
I'd have a half-funny suggestion. 

After listening to the ideas they had 
so far, I was not impressed. They were 
irrelevant, not topical; silly, not absurdist.
Obviously, they did not have
the benefit of reading this article. 

I thought for a moment. What was 
the most recent controversy? Well,
facial-recognition software was be- 
coming good enough and computa- 
tionally inexpensive enough that it 
was making the news and starting a 
lot of ethical debates. 

I blurted out, “Hey, didn’t we just purchase
a company that makes facial-recognition
software? You’d think a smart
bunch of people like that would be able 
to accurately place mustaches on all the 
photos in the corporate directory.” 

There was a short pause in the con- 
versation. Then one manager said, “We 
just moved those people into my build- 
ing. They sit down the hall from me.” 
Another manager chimed in that he 
manages the team that runs the cor- 
porate directory. Another manages the 
operations people for it. Another man- 
ages the helpdesk most likely to receive 
any complaints. 

Soon, we had a plan. 

We started meeting weekly. We wrote 
a design doc that spelled out how the 
AFP would work, how we would shut it 
off after 24 hours, and, most important- 
ly, how individual people could opt out 
if they complained. A project manager 
was assigned to coordinate people on 
three different continents to make it all 
happen as expected. HR and executive 
management signed off on the project. 

This was long before social media 
apps were doing this kind of thing, so 
the primary question we kept getting 
was, “Is this really possible?” 

Was it technically feasible? Yes. It 
turns out the free software develop- 
ment kit that the company provided 
included a mustache-placement API. 
“Mustaching a person” was the demo 
they used to sell the company. 

By the time April 1 rolled around, a 
new set of photos was prepared and 
ready to be swapped in. The help- 
desk was trained on how to revert 
individual photos. 

The prank was a huge success. Ev- 
eryone thought it was hilarious, except 
for one person who complained and 
opted out. 

Afterward, we wrote up a retrospec- 
tive and thanked everyone involved. 
In such a highly distributed company, 
this was the best way to let everyone involved
“take a bow.”


Launch It Like It’s Hot

If an AFP will have significant resource 
needs, load testing is important. Ev- 
eryone knows how to do load testing: 
simulate thousands of HTTP requests 
and take measurements. Find and fix 
the bottlenecks and repeat until you 
are satisfied. 

You also need to plan for the situ- 
ation where the AFP goes viral and 
receives 10 times or 100 times more 
users than you could ever expect. The 
easy strategy here is simply to plan on 
disabling the AFP, but it would be disappointing
if the reward for success
were to turn the feature off.

Fixing such a situation is difficult 
because normal solutions might take 
weeks to implement and April Fools’ 
Day lasts only one day. If you fix a prob- 
lem and relaunch the next day, you 
have missed the boat. 

Facebook is in a similar situation 
when launching real features because 
there is a lot of press around a new 
feature and Facebook needs to “get 
it right” on the first try. When Face- 
book was new, growth was slow and 
bottlenecks could simply be fixed as
they appeared, at the pace Facebook
was growing. By 2008 Facebook had 
millions of users, and a new feature 
would go from 0 to millions of users 
within hours. There would be no time 
to fix unexpected bottlenecks. A failed 
launch is highly visible and embar- 
rassing, often becoming front-page 
news. There is no way, however, to 
build an isolated system big enough 
to perform load testing. 

To solve this problem, Facebook 
uses a technique called a “dark
launch”: testing a feature by first
launching it invisibly. For example, 
Facebook launched Chat six months 
early but made it invisible (hidden via CSS).
The HTML and JavaScript
code was in your browser, but it did 
not display itself. A certain percent- 
age of users received a signal to send 
simulated chat messages through the 
system. The percentage was turned 
up over time so that developers could 
spot and fix any performance issues. By 
the time the feature was made visible 
(and the test messages were disabled), 
Facebook’s engineers were confident 
that the launch would not have perfor- 
mance problems. It is suggested that 
nearly every feature that Facebook will
launch in the next six months is already
running in your browser.
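
The gradual ramp-up at the heart of a dark launch can be sketched with deterministic bucketing. This is not Facebook's implementation. The function name, the hashing scheme, and the `chat` feature label are assumptions chosen to show how a percentage dial can pick a stable subset of users to generate simulated traffic.

```python
import hashlib


def in_dark_launch(user_id: str, percent: float, feature: str = "chat") -> bool:
    """Decide whether this user should silently exercise the hidden feature.

    Hashing the feature name together with the user id gives each user a
    stable bucket from 0 to 9999, so raising `percent` (0 to 100) only ever
    adds users and never reshuffles the ones already selected.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10_000
    return bucket < percent * 100


users = [f"user-{i}" for i in range(10_000)]
for percent in (1, 5, 20):
    active = sum(in_dark_launch(u, percent) for u in users)
    print(f"dial at {percent:2d}% -> {active} users sending simulated traffic")
```

Because the buckets are stable, engineers can turn the dial up in small steps, watch for bottlenecks at each level, and turn it back down without affecting anything a user can see.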

Google did something similar be- 
fore launching IPv6 connectivity; your 
browser was running invisible Java- 
Script that tested whether your ISP con- 
nection would fail if IPv6 were enabled. 
Worries were for naught, but the test 
increased confidence before launch. 

Stack Overflow dark launches 
new ad-serving infrastructure. When 
launching major features, we first use 
the system to transmit house ads that 
are invisible to users. Once perfor- 
mance is verified, we make the adver- 
tisements visible. Sadly, we did not use 
this technique when launching Stack- 
Egg, but now we know better. 


Pranks with Minimal Operational Impact

Technical issues can be avoided with 
proper testing, but there is a strat- 
egy that avoids the issue altogether. 
Simply create a prank that has no 
operational impact, or directs the 
impact elsewhere. 

The “Dance Dance Authentication” 
example is one such prank. The prank 
was simply a blog post and a link to a 
YouTube video (https://www.youtube.com/watch?v=VeC4b9K-gYU).
This does
not entirely avoid the issue, but if your 
success ends up overloading YouTube’s 
network, at least it is not your problem. 

You can also simply take an exist- 
ing feature and create an alternative 
explanation or history for it. For ex- 
ample, you may have heard of “the 
teddy bear effect.” Many have ob- 
served that often the act of asking 
a question forces you to think out 
enough details to realize the answer 
yourself. In Bell Labs folklore there 
was a researcher known for helping 
people with research roadblocks. 
People would go to him for sugges- 
tions. By listening, they would come 
up with the answer themselves. Once, 
he left on a long vacation and left a 
teddy bear on his desk with a note 
that read, “Explain your problem to 
the bear.” Many people found it was 
equally effective. (Lately, the Internet 
has started calling this “the rubber 
duckie effect.”) 

Suppose you run a question-and- 
answer website: some users post ques- 
tions, and other people post answers. 
Suppose also that the website has a feature
that permits people to write up the
answers to their own questions. A very 
simple but effective AFP would be to 
rename this feature “teddy bear mode” 
and write a blog post claiming this to 
be an entirely new feature, based on 
the power of a teddy bear’s ability to 
help solve technical issues. 


Summary 
Successful AFPs require care and plan- 
ning. Write a design proposal and a 
project plan. Involve operations early. 
If this is a technical change to your 
website, perform load testing, prefer- 
ably including a “dark launch” or hid- 
den launch test. Hide the prank behind 
a feature flag rather than requiring a 
new software release. Perform a retro- 
spective and publish the results widely. 
Remember that some of the best 
AFPs require little or no technical 
changes at all. For example, one could 
simply summarize the best prac- 
tices for launching any new feature 
but write it under the guise of how 
to launch an April Fools’ prank. That 
would be hilarious. 

Perfect should never
be the enemy of better.


BY THEO SCHLOSSNAGLE


Monitoring in a DevOps World


The title of this article might suggest it is about how
you are supposed to be monitoring systems in an 
organization that is making or has already made the 
transformation into DevOps. Actually, this is an article 
to make you think about how computing has changed 
and how your concept of monitoring perhaps needs 
re-centering before it even applies to the brave new 
world of DevOps. 

The harsh truth is that this is not just a brave, but 
also a fast, new world. One of the primary drivers for 
adopting DevOps is speed—particularly the reduction 
of risk at speed. An organization has to make many 
changes to accommodate this. The DevOps community 
often talks about automation and culture. This makes 
a lot of sense, as automation is where speed comes 
from and every problem can always be
rephrased as a people (or communica- 
tions) problem; automation and cul- 
ture are key. 

That said, the ground has shifted un- 
der the monitoring industry. This seis- 
mic change has caused existing tools to 
change and new tools to emerge in the 
monitoring space, but that alone will 
not deliver us into the low-risk world of 
DevOps—not without new and updated 
thinking. Why? Change. 

Monitoring, at its heart, is about ob- 
serving and determining the behavior 
of systems, often with an ulterior motive 
of determining “correctness.” Its pur- 
pose is to answer the ever-present ques- 
tion: Are my systems doing what they are 
supposed to? It’s also worth mention- 
ing that systems is a very generic term, 
and in healthy organizations, systems 
are seen in a far wider scope than just 
computers and computing services; they 
include sales, marketing, and finance, 
alongside other “business units,” so 
the business is seen as the complex 
interdependent system it truly is. That 
is, good monitoring can help people 
take a truly systems view not only of sys- 
tems, but also organizations. 

Long dead are the systems that age 
like fine wine. Today’s systems are 
born in an agile world and remain fluid 
to accommodate changes in both the 
supplier and the consumer landscape. 
A legitimate response to “adapt or die” 
is “I’ll do DevOps!” This highly dynam- 
ic system stands to challenge tradition- 
al monitoring paradigms. 


The Old World 

In a world with “slow” release cycles 
(often between six and 18 months), 
making software operational was an 
interesting challenge. The system de- 
ployed at the beginning of a release 
looked a lot like the same system sev- 
eral months later. It’s not that it was 
stuck in time, but more that it was 
branched into a maintenance-only 
mind-set. With maintenance come
bug fixes and even performance enhancements,
but not new features, new
system components, the removal of old
system components, or new functions
that would fundamentally
change the stress on the architecture.
Simply put, it is not very fluid.

For monitoring, this lack of fluidity 
is fantastic. If the system today is the 
system tomorrow and the exercise 
that system does today is largely the 
same tomorrow, then developing a set 
of expectations around how the sys- 
tem should behave becomes quite 
natural. From a more pragmatic point 
of view, the baselines developed by ob- 
serving the behavior of the system’s 
components will very likely live long, 
useful lives. 

This article is not going to dive into 
the risks involved with releasing a dra- 
matic set of code changes infrequently, 
as there are countless stories (anecdot- 
al and otherwise) that state their mag- 
nitude and probabilistic certainty. Suf- 
fice it to say: there be dragons on that 
path. This is one of the many reasons 
that agile, Kanban, and other more responsive
work processes have been so
widely adopted. DevOps is the organi- 
zational structure that makes the 
transformation possible. 


The New World 

So, we are all on board with rapid and 
fluid business and development pro- 
cesses, and we have continuous “ev- 
erything” to let us manage risk. The 
world is wonderful, right? Well, “con- 
tinuous monitoring” (in this new sense 
of continuous) doesn’t exist, and, be- 
sides, the name would be pretty dumb; 
shouldn’t all monitoring have always 
been continuous? 

The big problem here is that the 
fundamental principles that power 
monitoring, the very methods that 
judge if your machine is behaving it- 
self, require an understanding of what 
good behavior looks like. Whether you 
are building statistical baselines, using 
formal models, or just winging it, in order
to understand if systems are misbehaving,
you need to know what it looks
like when they are behaving. 
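As a minimal sketch of the "statistical baseline" idea, the Python below summarizes past observations and flags values that stray too far from them. The function names, the sample numbers, and the three-standard-deviation rule are assumptions chosen for illustration, not a recommendation from the article.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize past observations as (mean, standard deviation)."""
    return mean(history), stdev(history)


def is_misbehaving(value, baseline, k=3.0):
    """Flag a value more than k standard deviations from the baseline mean."""
    center, spread = baseline
    return abs(value - center) > k * spread


# Made-up requests-per-second readings from a system that is not changing.
history = [100, 102, 98, 101, 99, 103, 97, 100]
baseline = build_baseline(history)

print(is_misbehaving(104, baseline))  # within the usual spread
print(is_misbehaving(120, baseline))  # far outside the usual spread
```

A baseline like this is only as good as the assumption that tomorrow's system resembles the one that produced the history. Once the architecture changes, the stored mean and spread describe a system that no longer exists.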

In this new world, you not only have 
fluid development processes that can 
introduce change on a continual basis, 
you also have adopted a microservices- 
systems architecture pattern. Mi- 
croservices simply dictate that the so- 
lution to a specific technical problem 
should be isolated to a network-acces- 
sible service with clearly defined inter- 
faces such that the service has free- 
dom. Many developers like this model, 
as they are given more autonomy in the 
design of the service, extending to 
choice of language, database technol- 
ogy, etc. This freedom is very powerful, 
but its true value lies in decoupling re- 
lease schedules and maintenance, and 
allowing for independent higher-level 
decisions around security, resiliency, 
and compliance. 

This might seem like an odd tangent,
but the combination of these two
changes results in something quite
unexpected for the world of monitoring:
the system of today neither looks
like nor should behave like the system
of tomorrow.


An Aside on ML and AI

Many monitoring companies have 
been struggling to keep up with the na- 
ture of ephemeral architecture. Nodes 
come and go, and architectures dy- 
namically resize from one minute to 
the next in an attempt to meet growing 
and shrinking demand. As nodes spin 
up and subsequently disappear, moni- 
toring solutions must accommodate. 
While some old monitoring systems 
struggle with this concept, most mod- 
ern systems take this type of dynamic 
systems sizing in stride. 

The second and largely unmet chal- 
lenge is the dynamic nature not of an 
architecture’s size but rather of its de- 
sign. With microservices-based archi- 
tectures and multiple agile teams con- 
tinually releasing software and 
services, the design of modern archi- 
tecture is constantly in transition. 

A hot topic in monitoring is how to 
apply ML (machine learning) and AI (ar- 
tificial intelligence) to the problems at 
hand, but the current approaches seem 
to be attempting to solve yesterday’s 
problems and not tomorrow’s. AI and 
ML provide an exceptionally rich new 
set of techniques to solve problems and 
will undoubtedly prove instrumental in 
the monitoring world, but the problems
they must tackle are not those of
modeling an architecture and learning
to guide its operations. The architecture
a model learns today will have changed by
tomorrow, and any guidance will be antiquated.
Instead, to make a significant
impact, AI and ML approaches need to 
take a step back and help guide pro- 
cesses and design. 


Characteristics of Successful Monitoring

It would be cruel to cast a gloomy shad- 
ow on the state of monitoring without 
providing some tactical advice. Luck- 
ily, many people are monitoring their 
systems exceptionally well. Here is 
what they have in common: 

What is more important than how. 
The first thing to remember is that all 
the tools in the world will not help you 
detect bad behavior if you are looking 
at the wrong things. Be wary of tools 
that come with prescribed monitoring
for complex assembled systems; rarely 
are systems in the tech industry assem- 
bled and used in the same way at two 
different organizations. The likely sce- 
nario is that the monitoring will seem 
useless, but in some cases it may pro- 
vide a false confidence that the systems 
are functioning well. 

When it comes to monitoring the 
“right thing,” always look at your busi- 
ness from the top down. The technical 
systems the organization operates are 
only provisioned and operated to meet 
some stated business goal. Start by 
monitoring whether that goal is being 
met. A tongue-in-cheek example: al- 
ways monitor the payroll system, be- 
cause if you are not getting paid, what’s 
the point? 

Mathematics: It’s necessary. Sec- 
ond, embrace mathematics. In mod- 
ern times, functionality is table stakes; 
it isn’t enough that the system is work- 
ing, it must be working well. It is a rare 
day when you have an important moni- 
tor that consumes a Boolean value 
“good” or “bad.” Most often, systems 
are being monitored around delivered 
performance, so the consumed values 
(or indicators) are numbers and often 
latencies (a time representing how 
long a specific operation took). You are 
dealing with numbers now, so math is 
required, like it or not. Basic statistics 
are a fundamental requirement for 
both asking and interpreting the an- 
swers to questions about the behavior 
of systems. 
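A small worked example shows why basic statistics matter when the indicator is a latency. The numbers are invented for illustration; the only library used is Python's standard statistics module.

```python
from statistics import mean, median, quantiles

# Made-up sample: 95 fast requests and 5 very slow ones, in milliseconds.
latencies_ms = [12] * 95 + [900] * 5

avg = mean(latencies_ms)      # 56.4: describes no request anyone actually saw
p50 = median(latencies_ms)    # 12: the typical request
# quantiles(..., n=100) returns the 99 cut points between percentiles;
# index 98 is the 99th percentile.
p99 = quantiles(latencies_ms, n=100, method="inclusive")[98]  # 900

print(f"mean={avg:.1f} ms  median={p50} ms  p99={p99} ms")
```

The mean lands between the two groups and represents neither. The median and the 99th percentile together say what most users experienced and how bad the slow tail was.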

As systems grow and the focus turns 
more to their behavior, data volumes 
rise. In seven years, Circonus has expe- 
rienced an increase in data volume of 
almost seven orders of magnitude. 
Some people still monitor systems by 
taking a measurement from them ev- 
ery minute or so, but more and more 
people are actually observing what 
their systems are doing. This results in 
millions or tens of millions of mea- 
surements per second on standard 
servers. People tend not to solve diffi- 
cult problems unless the answers are 
valuable. Handling 10 million mea- 
surements per second from a single 
server when you might have thousands 
of servers might sound like overkill, but 
people are doing it because the technol- 
ogy exists that makes the cost of finding 
the answers less than the value of those
answers. People do it because they are
able to run better, faster systems and 
beat the competition. To handle data at 
that volume, you must also use a capa- 
ble set of tools. To form intelligent 
questions around data at this volume, 
you must embrace mathematics. 

As you might imagine, without a set 
of tools to help you perform fast, accu- 
rate, and appropriate mathematical 
analysis against your observed data, you 
will be at a considerable disadvantage; 
luckily, there are myriad choices from 
Python and R to tools that will help you 
find more comprehensive solutions 
from modern monitoring vendors. 

Data retention. A third important
characteristic of successful monitoring
systems is data retention. Monitoring
data has often been considered low
value and high cost and is often expunged
with impunity. Times have
changed, and, as with all things computing,
the cost of storing data has fallen
dramatically. More importantly, DevOps
has changed the value of long-term
retention of this data. DevOps is a
culture of learning. When things go 
wrong, and they always do, it is critical 
to have a robust process for interrogat- 
ing the system and the organization to 
understand how the failure transpired. 
This allows processes to be altered to 
reduce future risk. That’s right: learn- 
ing reduces risk. 

At the pace we move, it is undeni- 
able that your organization will devel- 
op intelligent questions regarding a 
failure that were missed immediately 
after past failures. Those new ques- 
tions are crucial to the development of 
your organization, but they become ab- 
solutely precious if you can travel back 
in time and ask those questions about 
past incidents. This is what data reten- 
tion in monitoring buys you. The new 
processes and interrogation methods 
you learn during your postmortems 
leading up to this year’s cyber-Thurs- 
day shopping traffic can now be ap- 
plied to last year’s cyber-Thursday 
shopping traffic. This often leads to 
fascinating and valuable learning that, 
you guessed it, reduces future risk. 

Be articulate about what success 
looks like. The final piece of advice for a 
successful monitoring system is to be 
specific about what success looks like. 
Using a language to articulate what success
looks like allows people to win. It is
wholly disheartening to think you’ve
done a good job and met expectations, 
and then learn the goalposts have 
moved or that you cannot articulate why 
you’ve been successful. The art of the 
SLI (service-level indicator), SLO (ser- 
vice-level objective), and SLA (service- 
level agreement) reigns here. Almost 
every low-level, ad hoc monitor and ev- 
ery high-level executive KPI (key perfor- 
mance indicator) can be articulated in 
terms of “service level.” Understanding 
the service your business provides and 
the levels at which you aim to deliver 
that service is the heart of monitoring. 

SLIs are things that you have identi- 
fied as directly related to the delivery of 
a service. SLOs are the goals you set for 
the team responsible for a given SLI. 
SLAs are SLOs with consequences, of- 
ten financial. Though a slight oversim- 
plification, think about it like this: 
What is important? What should it 
look like? What should I promise? For 
this, a good understanding of histo- 
grams can help. 
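The three questions can be made concrete with a short sketch. The class name LatencySLO, its fields, and the sample thresholds are hypothetical; they simply give "what is important", "what should it look like", and "what should I promise" a place to live in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencySLO:
    threshold_ms: float  # what should it look like: a request this fast is "good"
    target_ratio: float  # the goal: fraction of requests that must be "good"


def sli_good_ratio(latencies_ms, threshold_ms):
    """SLI: the fraction of requests served within the threshold."""
    if not latencies_ms:
        raise ValueError("no observations in this window")
    good = sum(1 for x in latencies_ms if x <= threshold_ms)
    return good / len(latencies_ms)


def meets_slo(latencies_ms, slo):
    """Compare the measured SLI with the objective."""
    return sli_good_ratio(latencies_ms, slo.threshold_ms) >= slo.target_ratio


# Made-up window of observations: 95 fast requests and 5 slow ones.
window = [12] * 95 + [900] * 5
print(meets_slo(window, LatencySLO(threshold_ms=100, target_ratio=0.95)))
print(meets_slo(window, LatencySLO(threshold_ms=100, target_ratio=0.99)))
```

An SLA would reuse the same check and attach a consequence to failing it. Note that the SLI here counts individual requests rather than averaging them, which is why the full distribution of latencies matters.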


From RUM to RSM 
Monitoring in the Web world moved 
long ago from the slow, synthetic ping 
of a website to recording and analyzing 
every interaction with every user; syn- 
thetic monitoring of the web gave way 
to RUM (real user monitoring) at the 
turn of the century and no one looked 
back. As we build more, smaller, decoupled
services, we move into a realm of
being responsible for servicing other 
small systems—these are what engi- 
neering SLOs are usually built around. 
The days of average latency for an 
API request or a database interaction 
or a disk operation (or even a syscall!) 
are disappearing. RSM (real systems 
monitoring) is coming, and we will, 
just as with RUM, be recording and 
analyzing systems-level interactions— 
every one of them. Over the last decade 
increased systems observability (such 
as the widely adopted DTrace and 
Linux’s eBPF) and improvements in 
time-series databases (such as the first- 
class histogram storage in Circonus’s 
IRONdb) have made it possible to de- 
liver RSM. (For a detailed look at why 
histogram storage of data is different 
and, more importantly, relevant, see 
the review by Baron Schwartz, “Why
Percentiles Don’t Work the Way You
Think,” https://www.vividcortex.com/blog/why-percentiles-dont-work-the-way-you-think.)

RSM allows you to look at actual sys- 
tem behavior comprehensively, ac- 
counting for the whole distribution of 
observed performance instead of the 
synthetically induced measurements 
that consistently misrepresent the ex- 
perience of using the system. 
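To illustrate why keeping the whole distribution matters, the sketch below stores latencies as bucketed histograms and merges two nodes before asking for a percentile. The bucket boundaries, node data, and function names are assumptions for the example; this is not how IRONdb or any particular product stores histograms.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bucket upper bounds, in milliseconds.
BOUNDS_MS = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000]


def to_histogram(latencies_ms):
    """Count observations per bucket; a value goes in the first bound >= value."""
    hist = Counter()
    for x in latencies_ms:
        idx = min(bisect_left(BOUNDS_MS, x), len(BOUNDS_MS) - 1)
        hist[BOUNDS_MS[idx]] += 1
    return hist


def percentile(hist, q):
    """Upper bound of the bucket holding the q-th percentile (integer q, 1..100)."""
    total = sum(hist.values())
    if total == 0:
        raise ValueError("empty histogram")
    seen = 0
    for bound in sorted(hist):
        seen += hist[bound]
        if seen * 100 >= q * total:  # integer math avoids rounding surprises
            return bound
    return max(hist)


node_a = to_histogram([8] * 990 + [900] * 10)  # busy node, 1% slow requests
node_b = to_histogram([8] * 5 + [900] * 5)     # quiet node, 50% slow requests

merged = node_a + node_b  # Counter addition sums the counts bucket by bucket
averaged_p99 = (percentile(node_a, 99) + percentile(node_b, 99)) / 2

print(percentile(merged, 99), averaged_p99)
```

Averaging the two per-node percentiles gives a number that matches no bucket and weights ten requests the same as a thousand. Merging the histograms first answers the question that was actually asked: how slow was the slowest 1% of all requests?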

The transition from synthetic web 
monitoring to RUM was seismic; ex- 
pect nothing less from the impending 
transition to RSM. 


Don’t Delay 
Today, with architectures dynamically 
shifting in size by the minute or hour 
and shifting in design by the day or 
the week, we need to step back and re- 
member that monitoring is about un- 
derstanding the behavior of systems, 
and that systems need not be limited to 
computers and software. A business is 
a complex system itself, including de- 
coupled but connected subsystems of 
sales, marketing, engineering, finance, 
and so on. Monitoring can be applied 
to all of these systems to measure im- 
portant indicators and detect changes 
in overall systems behavior. 
Monitoring can seem quite over- 
whelming. The most important thing 
to remember is that perfect should 
never be the enemy of better. DevOps 
enables highly iterative improvement 
within organizations. If you have no 
monitoring, get something; get any- 
thing. Something is better than noth- 
ing, and if you have embraced DevOps, 
you have already signed up for making 
it better over time. 

As the software industry enters the era
of language-oriented programming, it needs
programmable programming languages.

A Programmable Programming Language


In the ideal world, software developers would
analyze each problem in the language of its domain 
and then articulate solutions in matching terms. 
They could thus easily communicate with domain 
experts and separate problem-specific ideas from 
the details of general-purpose languages and specific 
program design decisions. 


In the real world, however, programmers use a
mainstream programming language someone else
picked for them. To address this conflict, they resort
to (and on occasion build their own) domain-specific
languages embedded in the chosen language
(embedded domain-specific languages, or eDSLs).
For example, JavaScript programmers employ jQuery
for interacting with the Document Object Model and
React for dealing with events and concurrency.

Key insights

Language-oriented programming is an emerging
software-development paradigm likely to revolutionize
the way people build software.

It elevates “language” itself to a software
building block, with the same status as
objects, modules, and components.

As with other paradigms, language
orientation thrives when the base
language supports it directly; the Racket
project has pursued language-oriented programming
for 20 years, providing a platform for exploring
this exciting new development in depth.
ILLUSTRATION BY CHRIS LABROOY


As developers solve their problems 
in appropriate eDSLs, they compose 
these solutions into one system; that 
is, they effectively write multilingual 
software in a common host language.* 

Sadly, multilingual eDSL programming
is done today on an ad hoc basis
and is rather cumbersome. To create
and deploy a language, programmers
usually must step outside the chosen
language to set up configuration files,
run compilation tools, and link in
the resulting object-code files. Worse, the
host languages fail to support the proper
and sound integration of components in
different eDSLs. Moreover, most available
integrated development environments
(IDEs) do not even understand
eDSLs or perceive the presence of code
written in eDSLs.

a The numerous language-like libraries in scripting
languages (such as JavaScript, Python, and
Ruby), books (such as Fowler and Parsons),
and websites (such as Federico Tomassetti’s,
https://tomassetti.me/resources-create-programming-languages/)
are evidence of the desire by programmers to use and
develop eDSLs.

The goal of the Racket project is
to explore this emerging idea of
language-oriented programming, or
LOP, at two different levels. At the
practical level, the goal is to build a
programming language that enables
language-oriented software design. This
language must facilitate easy creation
of eDSLs, immediate development of
components in these newly created
languages, and integration of components
in distinct eDSLs; Racket is available
at http://racket-lang.org/.

At the conceptual level, the case for
LOP is analogous to the ones for object-oriented
programming and for concurrency-oriented
programming. The
former arose from making the creation
and manipulation of objects syntactically
simple and dynamically cheap,
the latter from Erlang’s inexpensive
process creation and message passing.
Both innovations enabled new
ways to develop software and triggered
research projects. The question
is how our discipline will realize LOP
and how it will affect the world of software.

Figure 1. Small language-oriented programming example.
[Diagram of languages, including video/ffi, syntax-parse, and racket, connected by “builds on” and “extends” edges.]


Figure 2. A plain Racket module.

demo

    #lang racket/base

    (provide
      ;; type MaxPath = [Listof Edge]
      ;; Natural -> MaxPath
      walk-simplex)

    (require "constraints" graph)

    ;; Natural -> MaxPath
    (define (walk-simplex timing)
      (maximizer #:x 2) ...)


Figure 3. A module for describing a simplex shape.

constraints

    #lang simplex

    ;; implicitly provides synthesized function maximizer:
    ;; #:x Real -> Real
    ;; #:y Real -> Real

    #:variables x y

    3 * x + 5 * y <= 10
    ;; [second inequality illegible in the source]


Figure 4. Lambda, redefined.

new-lam

    01 #lang racket
    02
    03 (provide (rename-out [new-lambda lambda]))
    04
    05 (require (for-syntax syntax/parse))
    06
    07 ;; Syntax -> Syntax
    08 (define-syntax (new-lambda stx)
    09   (syntax-parse stx
    10     [(new-lambda (x:id (~literal ::) predicate:id) body:expr)
    11      (syntax
    12        (lambda (x)
    13          (unless (predicate x)
    14            (define name (object-name predicate))
    15            (error 'lambda "~a expected, given: ~e" name x))
    16          body))]))
    17


Our decision to develop a new lan- 
guage—Racket—is partly an historical 
artifact and partly due to our desire to 
free ourselves from any unnecessary 
constraints of industrial mainstream 
languages as we investigate LOP. The 
next section spells out how Racket got 
started, how we homed in on LOP, and
what the idea of LOP implies. 


Principles of Racket 

The Racket project dates to January
1995 when we started it as a language
for experimenting with pedagogic
programming languages. Working
on them quickly taught us that a
language itself is a problem-solving
tool. We soon found ourselves developing
different languages for different
parts of the project: a (meta-) language
for expressing many pedagogic
languages, another for specializing
the DrRacket IDE, and a third for
managing configurations. In the end,
the software was a multilingual system,
as outlined earlier.

Racket’s guiding principle reflects
the insight we gained: empower programmers
to create new programming
languages easily and add them with a
friction-free process to a codebase. By
“language,” we mean a new syntax, a
static semantics, and a dynamic semantics
that usually maps the new syntax
to elements of the host language
and possibly external languages via a
foreign-function interface (FFI). For a
concrete example, see Figure 1 for a diagram
of the architecture of a recently
developed pair of scripting languages
for video editing designed to assist
people who turn recordings of conference
presentations into YouTube videos
and channels. Most of that work is
repetitive (adding preludes and postludes,
concatenating playlists, and superimposing
audio), with few steps
demanding manual intervention. This
task calls for a domain-specific scripting
language; video is a declarative
eDSL that meets this need.

b The video language, including an overview of
the implementation, is available as a use-case
artifact at https://www2.ccs.neu.edu/racket/pubs/#icfp17-acf

The typed/video language adds a
type system to video. Clearly, the domain
of type systems comes with its own
language of expertise, and typed/video’s
implementation thus uses turnstile,
an eDSL created for expressing
type systems. Likewise, the implementation
of video’s rendering facility calls
for bindings to a multimedia framework.
Ours separates the binding definitions
from the repetitive details of FFI
calls, yielding two parts: an eDSL for
multimedia FFIs, dubbed video/ffi,
and a single program in the eDSL. Finally,
in support of creating all these eDSLs,
Racket comes with the syntax-parse
eDSL, which targets eDSL creation.

The LOP principle implies two sub- 
sidiary guidelines: 

Enable creators of a language to en- 
force its invariants. A programming lan- 
guage is an abstraction, and abstrac- 
tions are about integrity. Java, for 
example, comes with memory safety 
and type soundness. When a program 
consists of pieces in different languag- 
es, values flow from one context into 
another and need protection from op- 
erations that might violate their integ- 
rity, as we discuss later; and 

Turn extra-linguistic mechanisms into 
linguistic constructs. A LOP program- 
mer who resorts to extra-linguistic 
mechanisms effectively acknowledges 
that the chosen language lacks expressive
power. The numerous external
languages required to deal with Java 
projects—a configuration language, a 
project description language, and a 
makefile language—represent symp- 
toms of this problem. We treat such 
gaps as challenges later in the article. 

They have been developed in a
feedback loop that includes DrRacket
plus typed, lazy, and pedagogical
languages.


c Like many programming-language researchers,
we subscribe to a weak form of the Sapir-Whorf
hypothesis; see http://docs.racket-lang.org/algol60/
and https://www.hashcollision.org/brainfudge/
showing how Racket copes with
obscure syntax.


Libraries and Languages Reconciled

Racket is an heir of Lisp and Scheme. 
Unlike these ancestors, however, Rack- 
et emphasizes functional over impera- 
tive programming without enforcing 
an ideology. Racket is agnostic when it 
comes to surface syntax, accommodat- 
ing even conventional variants (such as 
Algol 60). Like many languages, Racket
comes with “batteries included.” 

Most notably, Racket eliminates the 
hard boundary between library and lan- 
guage, overcoming a seemingly intrac- 
table conflict. In practice, this means 
new linguistic constructs are as seam- 
lessly imported as functions and classes 
from libraries and packages. For exam- 
ple, Racket’s class system and for loops 
are imports from plain libraries, yet 
most programmers use these con- 
structs without ever noticing their na- 
ture as user-defined concepts. 

Racket’s key innovation is a modular 
syntax system, an improvement over
Scheme’s macro system, which in
turn improved on Lisp’s tree-transfor- 
mation system. A Racket module pro- 
vides such services as functions, classes, 
and linguistic constructs. To implement 
them, a module may require the services 
of other modules. In this world of mod- 
ules, creating a new language means 
simply creating a module that provides 
the services for a language. Such a mod- 
ule may subtract linguistic constructs 
from a base language, reinterpret oth- 
ers, and add a few new ones. A language 
is rarely built from scratch. 

Like Unix shell scripts, which specify
their dialect on the first line, every Racket
module specifies its language on the first
line, too. This language specification
refers to a file that contains a language-defining
module. Creating this file is all it
takes to install a language built in Racket.
Practically speaking, a programmer may
develop a language in one tab of the IDE,
while another tab may be a module written
in the language of the first. Without ever
leaving the IDE to run compilers, linkers,
or other tools, the developer can
modify the language implementation
in the first tab and immediately experience
the modification in the second;
that is, language development is a friction-free
process in Racket.

d See http://docs.racket-lang.org/algol60/, as
well as https://www.hashcollision.org/brainfudge/,
which shows how Racket copes
with obscure syntax.


In the world of shell scripts, the first- 
line convention eventually opened the 
door to a slew of alternatives to shells, 
including Perl, Python, and Ruby. The 
Racket world today reflects a similar 
phenomenon, with language libraries 
proliferating within its ecosystem: 
racket/base, the Racket core lan- 
guage; racket, the “batteries includ- 
ed” variant; and typed/racket, a 
typed variant. Some lesser-known examples
are datalog and a web-server
language. When precision is
needed, we use the lowercase name of 
the language in typewriter font; other- 
wise we use just “Racket.” 

Figure 2 is an illustrative module. Its 
first lne—pronounced “hash lang rack- 
et base”—says it is written in racket / 
base. The module provides a single 
function, walk-simplex. The accom- 
panying line comments—introduced 
with semicolons—informally state a 
type definition and a function signature 
in terms of this type definition; later, we 
show how developers can use typed/ 
racket to replace such comments with 
statically checked types, as in Figure 5. 
To implement this function, the mod- 
ule imports functionality from the con- 
straints module outlined in Figure 3. 
The last three lines of Figure 2 sketch
the definition of the walk-simplex
function, which refers to the maximizer
function imported from constraints.

The "constraints" module in Fig- 
ure 3 expresses the implementation of its 
only service in a domain-specific lan- 
guage because it deals with simplexes, 
which are naturally expressed through a 
system of inequalities. The module’s 
simplex language inherits the line-com- 
ment syntax from racket/base but 
uses infix syntax otherwise. As the com- 
ments state, the module exports a single 
function, maximizer, which consumes 
two optional keyword parameters. When 
called as (maximizer #:x n), as in Figure
2, it produces the maximal y value of the 
system of constraints. As in the lower half 
of Figure 3, these constraints are speci- 
fied with conventional syntax. 

In support of this kind of program- 
ming, Racket’s modular syntax system 
benefits from several key innovations. 
A particularly illustrative one is the 
ability to incrementally redefine the 
meaning of existing language constructs
via the module system. It allows eDSL
creators to ease their users into a new
language by reusing familiar syntax,
but reinterpreted.

Consider lambda expressions, for 
example. Suppose a developer wishes 
to equip a scripting language (such as 
video) with functions that check 
whether their arguments satisfy speci- 
fied predicates. Figure 4 shows the ba- 
sic idea: 

line 01 The module uses the racket 
language. 

line 03 It exports a defined com- 
pile-time function, new-lambda, un- 
der the name lambda, which is over- 
lined in the code to mark its origin as 
this module. 

line 05 Here, the module imports 
tools from a library for creating robust 
compile-time functions conveniently.

line 07 The comment says a function 
on syntax trees follows. 

line 08 While (define (f x) ...)
introduces an ordinary function f of x,
(define-syntax (c stx) ...) creates
the compile-time function c with a
single argument, stx.

line 09 As with many functional lan- 
guages, Racket comes with pattern- 
matching constructs. This one uses 
syntax-parse from the library men- 
tioned earlier. Its first piece specifies 
the to-be-matched tree (stx); the
remainder specifies a series of
pattern-response clauses.

line 10 This pattern matches any 
syntax tree with first token as new- 
lambda followed by a parameter speci- 
fication and a body. The annotation 
:id demands that the pattern variables 
x and predicate match only identifi- 
ers in the respective positions. Like- 
wise, :expr allows only expressions to 
match the body pattern variable. 

line 11 A compile-time function syn- 
thesizes new trees with syntax. 

line 12 The generated syntax tree is a 
lambda expression. Specifically, the 
function generates an expression that 
uses lambda. The underline in the 
code marks its origin as the ambient 
language, here racket. 

other lines Wherever the syntax sys- 
tem encounters the pattern variables 
x, predicate, and body, it inserts the 
respective subtrees that match x, predi- 
cate, and body. 

When another module uses “new-lam”
as its language, the compiler elaborates
the surface syntax into the core language
like this:

(lambda (x :: integer?) (+ x 1))
  -elaborates to->
(lambda (x :: integer?) (+ x 1))      ; lambda as exported by new-lam (overlined in Figure 4)
  -elaborates to->
(new-lambda (x :: integer?) (+ x 1))
  -elaborates to->
(lambda (x)                           ; lambda of the ambient racket language (underlined in Figure 4)
  (unless (integer? x)
    <elided error reporting>)
  (+ x 1))


The first elaboration step resolves
lambda to its imported meaning, the
overlined lambda. The second reverses the
“rename on export” instruction. Finally,
the new-lambda compile-time function
translates the given syntax tree into a
racket function.
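
The line-by-line walk through Figure 4 can be sketched in a single file. This is a minimal reconstruction under stated assumptions, not the authors' Figure 4: the submodule names new-lam and client, the function add1-checked, and the use of raise-argument-error in place of the elided error reporting are hypothetical choices made for illustration.

```racket
#lang racket

;; Hypothetical stand-in for the module described in Figure 4.
(module new-lam racket
  ;; Export the compile-time function new-lambda under the name lambda.
  (provide (rename-out [new-lambda lambda]))
  (require (for-syntax syntax/parse))

  ;; Syntax -> Syntax
  (define-syntax (new-lambda stx)
    (syntax-parse stx
      ;; x and predicate must be identifiers, body must be an expression.
      [(_ (x:id (~datum ::) predicate:id) body:expr)
       ;; lambda below is the ambient racket lambda, not the export.
       #'(lambda (x)
           (unless (predicate x)
             ;; Assumed error reporting; the article elides this part.
             (raise-argument-error 'lambda (symbol->string 'predicate) x))
           body)])))

;; A client that sees the reinterpreted lambda.
(module client racket
  (require (submod ".." new-lam)) ; shadows racket's lambda in this module
  (provide add1-checked)
  (define add1-checked (lambda (x :: integer?) (+ x 1))))

(require 'client)

(add1-checked 41)
```

Applying add1-checked to a non-integer raises a contract-style exception from the generated check rather than from +, which is the point of the precondition.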

In essence, Figure 4 implements a 
simplistic precondition system for one- 
argument functions. Next, the language 
developer might wish to introduce mul- 
tiargument lambda expressions, add a 
position for specifying the post-condi- 
tion, or make the annotations optional. 
Naturally, the compile-time functions 
could then be modified to check some or 
all of these annotations statically, even- 
tually resulting in a language that resem- 
bles typed/racket. 


Sound Cooperation Between Languages

A LOP-based software system consists 
of multiple cooperating components, 
each written in domain-specific lan- 
guages. Cooperation means the com- 
ponents exchange values, while “mul- 
tiple languages” implies these values 
are created in distinct languages. In 
this setting, things can easily go wrong, 
as demonstrated in Figure 5 with a toy
scenario. On the left, a module written
in typed/racket exports a numeric 
differentiation function. On the right, 
a module written in racket imports 
this function and applies it in three 
different ways, all illegal. If such il- 
legal uses of the function were to go 
undiscovered, developers would not 
be able to rely on type information for 
designing functions or for debugging, 
nor could compilers rely on them for 
optimizations. In general, cooperat- 
ing multilingual components must 
respect the invariants established by 
each participating language. 

In the real world, programming lan- 
guages satisfy a spectrum of guaran- 
tees about invariants. For example, C++ 
is unsound. A running C++ program 
may apply any operation to any bit pat- 
tern and, as long as the hardware does 
not object, program execution contin- 
ues. The program may even terminate 
“normally,” printing all kinds of output 
after the misinterpretation of the bits. In 
contrast, Java does not allow the misrep- 
resentation of bits but is only somewhat 
more sound than C++. ML improves on
Java again and is completely sound, 
with no value ever manipulated by an 
inappropriate operation. 

Racket aims to mirror this spectrum 
of soundness at two levels: language im- 
plementation itself and cooperation be- 
tween two components written in differ- 
ent embedded languages. First consider 
the soundness of languages. As the literature
on domain-specific languages suggests, such
languages normally evolve in a particular
manner, as is true for the Racket world, as in
Figure 6. A first implementation is often a thin
veneer over an efficient C-level API. Racket
developers create such a veneer with a foreign
interface that allows parenthesized C-level
programming. Programmers can refer to a C
library, import functions and data structures,
and wrap these imports in Racket values. Figure 7
illustrates the idea with a sketch of a module;
video’s initial implementation consisted of just
such a set of bindings to a video-rendering
framework. When a racket/base module
imports the ffi/unsafe library, the
language of the module is unsound.


Figure 5. Protecting invariants.


#lang typed/racket
;; module TR

(provide diff)

(: diff
   ((Real -> Real)
    ->
    (Real -> Real)))
(define (diff f)
  (lambda (x)
    (define lo (f (- x eps)))
    (define hi (f (+ x eps)))
    (/ (- hi lo)
       (* 2 eps))))


#lang racket
;; module RR

(require "TR.rkt")

;; scenario 1
(diff ...)

;; scenario 2
(define (f-bool x)
  #true)

(diff f-bool)

;; scenario 3
(define (f-str x)
  (string x x))

(diff f-str)

A language developer who starts with 
an unsound eDSL is likely to make it 
sound as the immediate next step. To 
this end, the language is equipped with 
runtime checks similar to those found 
in dynamically typed scripting languag- 
es to prevent the flow of bad values to 
unsound primitives. Unfortunately, 
such protection is ad hoc, and, unless 
developers are hypersensitive, the error 
messages may originate from inside the 
library, thus blaming some racket/ 
base primitive operation for the error. 
To address this problem, Racket comes 
with higher-order contracts with which 
a language developer might uniformly 
protect the API of a library from bad val- 
ues. For example, the video/ffi lan- 
guage provides language constructs for 
making the bindings to the video-ren- 
dering framework safe. In addition to 
plain logical assertions, Racket’s devel- 
opers are also experimenting with con- 
tracts for checking protocols, especially 
temporal ones. The built-in blame
mechanism of the contract library
ensures sound blame assignment.
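
As a rough illustration of hardening an unsound layer with a contract, the sketch below protects an unchecked primitive at the module boundary. It rests on assumptions: unsafe-vector-ref from racket/unsafe/ops stands in for a C-level binding (no real foreign library is involved), and the names hardened and fast-ref are hypothetical.

```racket
#lang racket

;; Hypothetical hardened layer; unsafe-vector-ref plays the role of an
;; unsound C-level primitive.
(module hardened racket
  (require racket/unsafe/ops)
  (provide
   (contract-out
    ;; The index must be a natural number below the vector's length.
    [fast-ref (->i ([v vector?]
                    [i (v) (and/c exact-nonnegative-integer?
                                  (</c (vector-length v)))])
                   [result any/c])]))
  ;; No bounds check here; the contract on the export supplies it.
  (define (fast-ref v i)
    (unsafe-vector-ref v i)))

(require 'hardened)

(fast-ref (vector 'a 'b 'c) 1) ; in range, so the unchecked access is safe
```

An out-of-range call such as (fast-ref (vector 1 2) 5) is stopped at the boundary, and the resulting error blames the calling module instead of surfacing from inside the library.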

Finally, a language developer may
wish to check some logical invariants
before the programs run. Checking
simple types is one example, though
other forms of static checking are also
possible. The typed/video language
illustrates this point with a type system
that checks the input and output types
of functions that may include numeric
constraints on the integer arguments;
as a result, no script can possibly render
a video of negative length. Likewise,
typed/racket is a typed variant
of (most of) racket.


Figure 6. Hardening a module.
(Diagram; legible labels: racket/base with ffi/unsafe, racket/base.)

Now consider the soundness of cooperating
languages. It is again up to the
language developer to anticipate how
programs in this language interact with
others. For example, the creator of
typed/video provides no protection
for its programs. In contrast, the creators
of typed/racket intended the
language to be used in a multilingual
context; typed/racket thus compiles
the types of exported functions into the
higher-order contracts mentioned.
When, for example, an exported function
must always be applied to integer
values, the generated contract inserts a
check that ensures the “integerness” of
the argument at every application site
for this function; there is no need to insert
such a check for the function’s return
points because the function is statically
type checked. For a function that
consumes an integer-valued function,
the contract must ensure the function
argument always returns an integer. In
general, a contract wraps exported values
with a proxy that controls access to
the value. The idea is due to Matthews
and Findler, while Tobin-Hochstadt’s
and Felleisen’s Blame Theorem
showed that if something goes wrong
with such a mixed system, the runtime
exception points to two faulty components
and their boundary as the source
of the problem. In general, Racket supplies
a range of protection mechanisms,
and a language creator can use them to
implement a range of soundness guarantees
for cooperating eDSLs.


Figure 7. A Racket module using the foreign-function interface.


#lang racket/base
;; module ffi

(provide
 ;; [Vectorof [Vectorof Real]] -> [Vectorof Real]
 simplex)

(require ffi/unsafe)

(define (simplex M)
  ... (ffi-simplex-set ...) ...)

(define lib-simplex (ffi-lib "./coin-Clp/lib/libClp"))

(define ffi-simplex-set
  (get-ffi-obj "simplex" lib-simplex (_fun _bytes -> _void)))


Figure 8. A sketch of an industrial example of language-oriented programming.
(Diagram; legible labels: scene-description, music-score, and transitions, each marked “elaborate to”; racket/ffi, marked “embed in.”)
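
To make the proxy idea concrete, here is a hand-written analog of the contract that a typed export like Figure 5's diff would need. It is a sketch under assumptions, not what typed/racket actually generates: the contract is written by hand with contract-out, the submodule name numeric is hypothetical, and eps is given an assumed value because the figure leaves it undefined.

```racket
#lang racket

;; Hypothetical untyped analog of Figure 5's typed module.
(module numeric racket
  (provide
   (contract-out
    ;; Higher-order contract: the argument and the result are functions.
    [diff (-> (-> real? real?) (-> real? real?))]))
  (define eps 1e-3) ; assumed step size
  (define (diff f)
    (lambda (x)
      (define lo (f (- x eps)))
      (define hi (f (+ x eps)))
      (/ (- hi lo) (* 2 eps)))))

(require 'numeric)

;; A legal use: the derivative of x^2 at 3.0 is close to 6.0.
(define d-sqr (diff (lambda (x) (* x x))))
(d-sqr 3.0)

;; A boolean-valued function passes the first-order check (it is a
;; procedure), so diff returns normally with the argument wrapped in a proxy.
(define d-bool (diff (lambda (x) #true)))
;; Only applying d-bool exposes the bad result; (d-bool 1.0) raises a
;; contract violation that blames this client module.
```

A non-function argument such as (diff 1) is rejected immediately, while the boolean-valued function is caught later, when the proxy observes its result. That delay is why a plain assertion at the call site is not enough for higher-order values.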


Universality vs. Expressiveness 
That a general-purpose language
can compute all partial-recursive
functions does not mean programmers can
express all their ideas about programs
in this language. This point is
best illustrated through an example.
So, imagine the challenge of building 
an IDE for a new programming lan- 
guage in the very same language. Like 
any modern IDE, it is supposed to en- 
able users to compile and run their 
code. If the code goes into an infinite 
loop, the user must be able to terminate
it with a simple mouse click. To
implement this capability in a natural
manner, the language must internalize
the idea of a controllable process,
a thread. If it does not internalize
such a notion, the implementer of the 
IDE must step outside the language 
and somehow re-use processes from 
the underlying operating system. 

For a programming language researcher,
“stepping outside the language”
signals failure. Or, as Ingalls
said, “[an] operating system is a collec- 
tion of things that don’t fit into a lan- 
guage[; t]here shouldn’t be one.” We, 
Racket creators, have sought to identify 
services Racket borrows from the sur- 
rounding operating system and assimi- 
late them into the language itself. Here 
are three sample constructs for which 
programmers used to step outside of 
Racket but no longer need to: 

Sandboxes. These restrict access to
resources;

Inspectors. These control reflective
capabilities; and

Custodians. These manage resources
(such as threads and sockets).


e An alternative is to rewrite the entire program
before handing it to the given compiler,
exactly what distinguishes “expressiveness”
from “universality.”

To understand how inclusion of 
such services helps language designers, 
consider a 2014 example, the shill 
language. Roughly speaking, shill is
a secure scripting language in Racket’s 
ecosystem. With shill, a developer ar- 
ticulates fine-grain security and re- 
source policies—along with, say, what 
files a function may access or what bi- 
naries the script may run—and the lan- 
guage ensures these constraints are sat- 
isfied. To make this concrete, consider 
a homework server to which students 
can submit their programs. The in- 
structor might wish to run an auto- 
grade process for all submissions. Us- 
ing a shill script, the homework 
server can execute student programs 
that cannot successfully attack the serv- 
er, poke around in the file system for 
solutions, or access external connec- 
tions to steal other students’ solutions. 
Naturally, shill’s implementation 
makes extensive use of Racket’s means 
of running code in sandboxes and har- 
vesting resources via custodians. 
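
The IDE scenario above (stopping runaway code on demand) can be sketched with a custodian and a thread. This is a minimal illustration, not how DrRacket or shill is implemented: run-with-timeout is a hypothetical helper, and a timeout stands in for the user's mouse click.

```racket
#lang racket/base

;; run-with-timeout is a hypothetical helper, not part of Racket.
;; It runs thunk in a thread owned by a fresh custodian and reclaims
;; everything that thread created once the deadline passes.
(define (run-with-timeout thunk seconds)
  (define cust (make-custodian))
  (define result-ch (make-channel))
  ;; The thread is created while cust is the current custodian,
  ;; so cust manages it and any resources it opens.
  (parameterize ([current-custodian cust])
    (thread (lambda () (channel-put result-ch (thunk)))))
  ;; Wait for a result or the deadline; wrap the value in a list so a
  ;; legitimate #f result is distinguishable from a timeout.
  (define outcome (sync/timeout seconds (wrap-evt result-ch list)))
  ;; Shutting down the custodian kills the thread whether or not it finished.
  (custodian-shutdown-all cust)
  (if outcome (car outcome) 'timed-out))

(run-with-timeout (lambda () (+ 1 2)) 1)
(run-with-timeout (lambda () (let loop () (loop))) 0.1)
```

The first call returns the thunk's value; the second never finishes on its own, so the custodian shutdown ends it and the helper reports 'timed-out. No operating system process is involved at any point.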


State of Affairs 

The preceding sections explained how 
Racket enables programmers to do the 
following: 

Create languages. Create by way of 
linguistic reuse for specific tasks and 
aspects of a problem; 

Equip with soundness. Equip a lan- 
guage with almost any conventional 
level of soundness, as found in ordi- 
nary language implementations; and 

Exploit services. Exploit a variety of 
internalized operating system services 
for constructing runtime libraries for 
these embedded languages. 

What makes such language-orient- 
ed programming work is “incremental- 
ity,” or the ability to develop languages 
in small pieces, step by step. If conven- 
tional syntax is not a concern, develop- 
ers can create new languages from old 
ones, one construct at a time. Like- 
wise, they do not have to deliver a 
sound and secure product all at once; 
they can thus create a new language as 
a wrapper around, say, an existing C- 
level library, gradually tease out more 
of the language from the interface, 
and make the language as sound or se- 
cure as time permits or a growing user 
base demands. 


Moreover, the entire process takes
place within the Racket ecosystem. A 
developer creates a language as a Rack- 
et module and installs it by “import- 
ing” it into another module. This tight 
coupling has two implications: the de- 
velopment tools of the ecosystem can 
be used for creating language modules 
and their clients; and the language be- 
comes available for creating more lan- 
guages. Large projects often employ a 
tower involving a few dozen languages, 
all helping manage the daunting com- 
plexity in modern software systems. 

Sony’s Naughty Dog game studio has 
created just such a large project, actual- 
ly a framework for creating projects. 
Roughly speaking, Sony’s Racket-based 
architecture provides languages for de- 
scribing scenes, transitions between 
scenes, scores for scenes, and more. Do- 
main specialists use the languages to 
describe aspects of the game. The Rack- 
et implementation composes these do- 
main-specific programs, then compiles 
them into dynamically linked libraries 
for a C-based game engine; Figure 8 
sketches the arrangement graphically. 

Racket’s approach to language-ori- 
ented programming is by no means per- 
fect. To start with, recognizing when a 
library should become a language re- 
quires a discriminating judgment call. 
The next steps require good choices in 
terms of linguistic constructs, syntax, 
and runtime primitives. 

As for concrete syntax, Racket cur- 
rently has strong support for typical, 
incremental Lisp-style syntax develop- 
ment, including traditional support 
for conventional syntax, or generating 
lexers and parsers. While traditional 
parsing introduces the natural separa- 
tion between surface syntax and mean- 
ing mentioned earlier, it also means 
the development process is no longer 
incremental. The proper solution 
would be to inject Racket ideas into a 
context where conventional syntax is 
the default.


f Language workbenches (such as Spoofax)
deal with conventional syntax for DSLs but do
not support the incremental modification of
existing languages. A 2015 report suggests,
however, these tool chains are also converging
toward the idea of language creation as
language modification. We conjecture that,
given sufficient time, development of Racket
and language workbenches will converge on
similar designs.



As for static checking, Racket forc- 
es language designers to develop such 
checkers wholesale, not incremental- 
ly. The type checker for typed/rack- 
et looks like, for example, the type 
checker for any conventionally typed 
language; it is a complete recursive- 
descent algorithm that traverses the 
module’s representation and algebra- 
ically checks types. What Racket de- 
velopers really want is a way to attach 
type-checking rules to linguistic con- 
structs, so such algorithms can be syn- 
thesized as needed. 

Chang et al. probably took a first 
step toward a solution for this prob- 
lem and have thus far demonstrated 
how their approach can equip a DSL 
with any structural type system in an 
incremental and modular manner. A 
fully general solution must also cope 
with substructural type systems (such 
as the Rust programming language) 
and static program analyses (such as 
those found in most compilers). 

As for dynamic checking, Racket 
suffers from two notable limitations: 
On one hand, it provides the building 
blocks for making language coopera- 
tion sound, but developers must cre- 
ate the necessary soundness harness- 
es on an ad hoc basis. To facilitate the 
composition of components in differ- 
ent languages, Racket developers 
need both a theoretical framework 
and abstractions for the partial auto- 
mation of this task. On the other hand, 
the available spectrum of soundness 
mechanisms lacks power at both 
ends, and how to integrate these pow- 
ers seamlessly is unclear. To achieve 
full control over its context, Racket 
probably needs access to assembly 
languages on all possible platforms, 
from hardware to browsers. To realize 
the full power of types, typed/rack- 
et will have to be equipped with de- 
pendent types. For example, when a 
Racket program uses vectors, its cor- 
responding typed variant type-checks 
what goes into them and what comes 
out, but like ML or Haskell, indexing 
is left to a (contractual) check in the 
runtime system. Tobin-Hochstadt and 
his Typed Racket group are working 
on first steps in this direction, focusing
on numeric constraints, similar
to Xi’s and Pfenning’s research.

As for security, the Racket project is 
still looking for a significant breakthrough.
While the shill team was
able to construct the language inside
the Racket ecosystem, its work exposed
serious gaps between Racket’s
principle of language-oriented programming
and its approach to enforcing
security policies. It thus had to alter
many of Racket’s security
mechanisms and invent new ones.
Racket must clearly make this step 
much easier, meaning more research 
is needed to turn security into an inte- 
gral part of language creation. 

Finally, LOP also poses brand- 
new challenges for tool builders. An 
IDE typically provides tools for a sin- 
gle programming language or a fam- 
ily of related languages, including 
debuggers, tracers, and profilers. 
Good tools communicate with devel- 
opers in terms of the source lan- 
guage. Due to its very nature, LOP 
calls for customization of such tools 
to many languages, along with their 
abstractions and invariants. We have 
partially succeeded in building a tool 
for debugging programs in the syntax
language, have the foundations
of a debugging framework, and have
started to explore how to infer scoping
rules and high-level semantics
for newly introduced, language-level
abstractions. Customizing these
tools automatically to newly created 
(combinations of) languages re- 
mains an open challenge. 


Conclusion 

Programming language research is 
short of its ultimate goal—provide 
software developers tools for for- 
mulating solutions in the languages 
of problem domains. Racket is one 
attempt to continue the search for 
proper linguistic abstractions. While 
it has achieved remarkable success 
in this direction, it also shows that 
programming-language research has 
many problems to address before the 
vision of language-oriented program- 
ming becomes reality. 
Older adults consistently reject
digital technology even when designed
to be accessible and trustworthy.


BY BRAN KNOWLES AND VICKI L. HANSON


The Wisdom of Older Technology (Non)Users


It is impossible not to notice that many of the questions
driving research on technology use by older adults
today are the same as those at the forefront of aging
and accessibility research 20 years ago. Back then,
computers were predominantly large desktops, social
media was still on the horizon, and mobile phones
were large and not (yet) smart. Older adults had little
presence on the Internet. Today, devices have changed
and older adults are increasingly online. They do,
however, continue to lag in broadband use, breadth
of applications used, and time online. Typical
reports reflect they have little interest in social media
(other than staying in touch with family) and
are skeptical of online financial transactions.
Clearly, the problem of older adults’ comparatively
limited technology use has not gone away despite
a more tech-savvy group of people aging
into the “older adult” category.
According to the most recent data,
from 2014-2016, predictions of a
forthcoming “Silver Tsunami” of retired
workers (a cohort now accustomed
to digital technology access in
their working lives and therefore able
to take full advantage of the Internet)
have not come true. Indeed, the overwhelming
perception remains of older
generations being incapable of or other- 
wise resistant to using technology. A 
“digital divide” between old and young 
is potentially more disabling now com- 
pared to 20 years ago, given the push 
for a more fully realized digital society. 
Digital technologies today are so essen- 
tial to daily life that it is reasonable to 
ask whether older adults’ inability to 
access online-only government servic- 
es may soon be included among the 
precipitating factors in older adults 
moving into assisted living. 

While we see the emergence of calls
for a more holistic view of how to design
technology for older adults
than was the case 20 years ago, interventions
to get older adults online
commonly focus on age-related declines
(such as vision, hearing, cognition,
and dexterity) as the principal
barriers to technology adoption. These
interventions are often senior-friendly
variants or adaptations to make the
technology more accessible. However,
older adults are considerably less
likely than their younger counterparts
with a disability to adopt assistive tools
designed specifically for them,
suggesting perhaps that they do not view the
conditions of aging as disabling, or
(in addition) that their resistance to
technology adoption is not solely or
even primarily rooted in usability/accessibility
issues (see the sidebar
“Who Is an ‘Older Adult’?”).


key insights


■ Older adults’ non-use of digital
technologies is purposeful and thus
instructive for identifying problematic
consequences of these technologies
for the population at large.


■ Looking beyond traditional thinking
on how to make it easier for older adults
to use technology, we identify factors
relating to responsibility, values,
and cultural expectations contributing
to older adults’ resistance to
digital technologies.


■ These factors emerge from the cultural
changes driven by technological
innovation, so will likely remain barriers
to adoption, even as younger generations
age, unless new technologies are
designed with sensitivity to values
fostered through such experience.

Our interviews with older adults reveal
they are often unwilling to acknowledge
that their lives would be enriched
through digital technologies,
whether or not they were made accessible.
It is this attitude concerning technology
that intrigues us. Given that the
kinds of technologies and applications
older adults are receptive to or averse to
vary by individual, older users do not
appear to be identifying inherent design
failings of any specific tools. Are
there bigger-picture issues with “digital
society” that lead older adults to reject
particular technologies? If so, are they
likely to be of continuing relevance
when future generations age into older
adults? And is there anything that can
be done to address them?

Here, we draw from our own recent 
research interviews and a substantial 
body of experience working with older 
adults to describe three factors—re- 
sponsibility, values, and cultural ex- 
pectations—that contribute to older 
adults’ resistance to the digital profi- 
ciency that is ostensibly required to be 
fully participating, independent citi- 
zens in our increasingly digital society. 
These factors suggest new directions 
for aging and accessibility research, 
while also being more broadly instruc- 
tive for creating a digital society that 
works for everyone. 
Older Users’ Experience

While there is reason for optimism
concerning older adults’ adoption of
technology when looking at their increasing
online participation, that
participation is qualitatively different
from that of younger users, being more
limited in time and variety of experiences.
We sought to better understand the
underlying reasons for these differences
in a series of group interviews with
a total of 14 post-retirement community-dwelling
individuals, ages 66 to
86, around Dundee, Scotland, whom we
drew from an established older-adult
participant pool. While these discussions
revealed some physical and cognitive
decline among participants, we
were not aware of any having physical
or cognitive deterioration outside the
typical range for their age. The focus
groups followed a semi-structured
format that allowed significant conversational
steer by participants. Our
conversations focused on participants’
use of the Internet, what they
did not use it for, and what aspects of
digital technologies they did or did
not trust.


Who Is an ‘Older Adult’?


Various age groupings have been used over the years to define “older.” The fact is,
aging is a process. Governments define age for pensions and Social Security, and
various services offer senior rates based on age. From an individual’s point of view,
however, age is largely a state of mind. A person who is 60 may feel old, while another
who is 80 may not. Research on the use of technology by older adults has varied in
terms of the cut points for age categorization. Over age 50 is often used, though a
less-controversial cutoff would be age 65. As noted by one of our participants, people do not
suddenly wake up one day and find themselves “old,” nor do they wake up to find they
are no longer able to use technology. Age itself is not the sole criterion determining
technology usage behavior, as there is great variability in adoption and use by those
conventionally categorized as “older adults.” So while some broad assertions can be
made about older adults in comparison with, say, younger adults, it is always important
for researchers to be aware of the ways older adults are individuals.

Overall, participants were open to 
using at least a limited set of applica- 
tions. Email and general Web brows- 
ing were used by all. Social network- 
ing, travel booking, online shopping, 
and online banking were used by 
some, often to the point of depen- 
dence. Notably, participants did not 
consider learning and using technolo- 
gy rewarding in and of itself. Many also 
talked about consciously avoiding “get- 
ting caught up in” digital life, viewing 
the abundance of applications and fea- 
tures as potential diversions from 
more rewarding activities. Social net- 
working often fell into the time-wast- 
ing category, with many noting the in- 
sipidness of the content on Facebook, 
though some found it useful (and even 
enjoyable) for keeping in contact with 
family. For those in the former camp, 
there was a strong aversion both to the 
idea of one’s life being an “open book” 
and being glued to one’s mobile 
phone—trends they found deeply trou- 
bling in younger generations. 

Besides the limited range of tools 
the participants adopted, the most 
striking characteristic of their reported 
use was lingering discomfort. “Al- 
though I use the computer, I find it 
quite frightening,” admitted one wom- 
an. “The reason I find it frightening is 
that I don’t understand it. And I don’t 
know how to put things right.” They de- 
scribed feeling much more competent, 
and therefore more comfortable, with 
analog equivalents (such as paper
archives and paper calendars). They
described worrying about and planning
for the eventuality of their computer 
“blowing up.” Security concerns were 
omnipresent. Even tools used regularly 
were not trusted per se. Rather, when 
they acknowledged significant benefits 
of specific tools, they used them in 
spite of unresolved concerns regarding 
their trustworthiness. 

While none of what we found may 
be surprising, it is worth emphasizing 
that this pattern of use paints a picture 
that clashes with the dominant cultur- 
al narrative of older adults being resis- 
tant to all digital technologies by de- 
fault. It also provides a more nuanced 
view of recent claims that uptake of 
digital technologies is rising among 
older adults; while a much greater per- 
centage of older adults is online than a 
decade ago, they are very discriminat- 
ing in what they are willing to do (see 
the sidebar “Why Focus on Nonuse?”). 
And, as we intend to show, much of 
what underlies the resistance is inher- 
ent in aspects of being older that are 
unlikely to change as new generations 
reach retirement age. 


Underlying Problems 

Here, we explore what underlies older 
adults’ resistance to the many digital 
tools that would ostensibly provide so 
many benefits to them (such as easing 
loneliness and isolation, being in con- 
trol of decisions that affect them, liv- 
ing independently, and participating 
in and contributing to society).1 We 
identify three clusters of factors that 
can contribute to resistance, though 
we note that their relevance and their in-
teractions play out differently within
each individual.

Perception of risk. Upon retiring,
people in the industrialized West lose 
an important training ground (and 
motivation) for developing compe- 
tence with emerging technologies. 
One approach to addressing it is to cre- 
ate IT drop-in centers or training 
courses tailored to older users. Such 
resources are valuable for older adults 
seeking information relating to pre- 
cise steps for executing a task, or pro- 
cedural knowledge, as they so often 
ask for, as explored by Leung et al.,
but typically do not strengthen their 
conceptual grounding in ways that en- 
able them to execute unfamiliar tasks. 
As a result, existing training opportu- 
nities for older adults do little to affect 
generalized anxiety about not “under- 
standing” technologies. Most of our 
participants seemed to worry they did 
not know enough to use the tools ef- 
fectively and responsibly and did not 
know how they would know when they 
did know enough. 

A contributor to these feelings of in- 
competence was that in the past our 
participants would seek out trained 
professionals to accomplish specialty 
tasks for them; for example, they would 
go to a mortgage advisor to get advice 
on choosing a mortgage, to a travel 
agent to arrange hotels and flights, or 
to a banker to handle the transfer of 
money. A consequence of having more 
immediate “control” over these tasks
is having to take on new responsibili- 
ties, which some felt equated to “hav- 
ing a part-time job,” requiring hours in 
front of a computer screen. Though 
this may surprise many adults in full- 
time employment, older adults’ lives 
are still extremely busy, with clubs, ac- 
tivities, commitments to family and 
friends, and more mundane chores 
and responsibilities (such as home re- 
pair, medical appointments, and shop- 
ping). They simply do not have time to 
learn how to use online services well 
enough to use them with confidence. 

In terms of online banking, a com- 
mon response is, “I don’t trust it,” as in 
Vines et al. But upon further probing,
it becomes clear that these older adults 
do not trust themselves. They lack con- 
fidence in their ability to use the tools 
and fear the consequences of making 
mistakes. What happens to their mon- 
ey if they press the wrong button? If 
they are hacked, will they be held ac-
countable for not following security
protocols they ought to have known? 
They are right to worry in both cases. It 
is unclear what mistakes might be cor- 
rectable and quite likely that more per- 
sonal responsibility will be assumed as 
expectations for digital proficiency 
rise. We can hardly fault older adults 
for deciding it unwise to use any tool— 
online banking and shopping or sub- 
mitting official government forms— 
without learning to use it in ways that 
ensure their safety and security. 

The assurance of a clearly under- 
standable safety net when conducting 
digital activities is essential for older 
adults to adopt tools, more so (with on- 
line financial transactions) because 
recovery from being defrauded of their 
financial savings would be much more 
difficult. In addition to developing a 
legal scaffolding for such a safety net 
or policies forcing businesses to as- 
sume the costs of user error, including 
accidental breaches of security proto- 
cols, there is important work to be 
done in devising mechanisms and 
user interactions that make data sys- 
tems more (if never entirely) foolproof. 
Acknowledging that individuals often 
adopt a tool despite not trusting it, it is 
critical to design mechanisms that 
help manage user anxieties (such as 
providing necessary feedback and re- 
assurances throughout the interac- 
tions) to ensure effective use and pre- 
vent panic and abandonment. 

The value proposition. Not often 
discussed in the literature on aging 
and accessibility is that choosing not to 
use new technology can be seen as a
rational decision for older adults, de- 
pending on their resources and needs; 
for example, among those who live on 
limited pensions, it may be difficult to 
justify the financial outlay for broad- 
band alone. And in the case of a service 
like online shopping, while it would 
seemingly provide numerous bene- 
fits—saving time and money or not 
having to travel—it would also replace 
an important social activity for those 
who shop (sometimes daily) purely for 
the social benefit. Indeed, older adults 
we interviewed work hard to strike a 
positive balance between online profi- 
ciency and cultivation of rich offline 
social worlds. 

A surprisingly common feature of 
our conversations with older adults
about reasons for resisting certain
technologies was their strong sense of 
social responsibility. They worry, for 
example, that online shopping takes 
business away from local shops, mean- 
ing there will soon be no vibrant town 
centers in which to socialize with 
friends. They worry in particular that if 
they do not make an effort to attend, 
say, physical shops and banks, the peo- 
ple who work there will soon be out of 
their jobs. One participant expressed 
sympathy for the “delightful” recep- 
tionist at the sports facility whose job 
was replaced by an app for booking 
classes. Another said she would never 
pay her road tax or anything else on- 
line, not because of concern about us- 
ing the technology but, “I just think I 
want to keep the post office open.” 

If and when some of the digital 
technologies older adults resist be- 
come essential to daily (indepen- 
dent) living, it will be important to 
explore that resistance to under- 
stand what might motivate or en- 
able them to use the technologies in 
the future. While older adults’ men- 
tal trade-offs—weighing the per- 
ceived benefits against the financial 
cost of the technologies, the time it 
will take to learn to use them, the so- 
cial interactions they may lose, and 
the jobs the technologies replace— 
will be evaluated differently by differ- 
ent people, there are at least three 
things that must change to tip the 
scales. Broadband cannot continue 
to be charged at rates that are prohib- 
itive for many older adults; this needs 
to be treated by government regula- 
tors and access providers as a basic 
need like electricity. Moreover, with
loneliness being such a common 
characteristic of the older-adult expe- 
rience, greater attention needs to be 
paid to ensuring digital engagements 
do not replace social interactions but
instead where possible facilitate new 
social and community-building op- 
portunities. Today, social networking 
systems (such as Facebook) fail to broad- 
ly address this need for older adults. Part 
of getting older adults online will also be 
developing strategies for creating new, 
good-quality jobs in place of those the 
digital technologies make redundant— 
something that, regardless of older 
adults’ attitudes toward technology, 
needs attention.

Freedom of low expectations. The 
notion that aging per se leads to tech- 
nology abandonment does not with- 
stand scrutiny. And yet older adults 
themselves are often the worst perpet- 
uators of the myth, quick to excuse 
their disinterest in a given tool with 
the seemingly self-explanatory line, 
“I’m too old.” It is worth considering,
then, what older adults might gain 
from this stereotype. 

We (the authors) have come to un- 
derstand that it affords older adults 
the privilege of taking quiet personal 
stands against the aspects of technol- 
ogy they find worrying, threatening, or 
plain annoying. For example, a com- 
mon justification for not using Face- 
book and other social media is the cy- 
ber-bullying, “stalking” behavior, and 
“narcissism” they seem to encourage. 
One older adult we talked to said, “I 
don’t do Facebook. Having been a 
teacher, I think it’s got loads of prob-
lems for young people. [T]here’s pres-
sure put on them if they haven’t got
500 friends, and I think there’s all 
sorts of online bullying, and I thought, 
‘No, this is not really for me.’” This is a 
purely political stand; this woman 
would not be a victim of the problems 
she is raising, and because there is no 
expectation she would use Facebook, 
she can easily act on her principles. 


Why Focus on Nonuse? 


There is a tendency in public discourse and in computing research to view older 
nonusers as “problematic.” And it is true that by resisting technologies and/or not 
being able to use them as intended by their designers, older adults impose challenges 
for realizing the technocentric vision of a fully digital society. But considering nonuse 
from the perspective of not doing something obscures the fact that nonuse is “active, 
meaningful, motivated, considered, structured, specific, nuanced, directed, and 
productive.” Understanding what alternative meaning is conveyed when individuals 
selectively choose nonuse enables reflection on the meaning(s) implicated or 
embodied by the technologies the nonusers are rejecting.

Likewise, when older adults say,
“Maybe I’m just in a generation where 
I’d rather go into a bank and speak to
someone face-to-face,” they may sim- 
ply be playing into the common view 
that they are creatures of habit with 
nothing better to do as a cover for their 
prevailing sense of social responsibili- 
ty. Playing the “age card” to justify re- 
jection of technology is one way older 
adults take a stand while minimizing 
the risk of doing so. 

It is also often the case that older 
adults may simply prefer so-called tra- 
ditional forms of communication, 
face-to-face, allowing them to ask for 
the help most of us wish we were enti- 
tled to. When it comes to, say, filling in 
a government form, we have presum- 
ably all had the experience this older 
woman described to us, saying, “I 
think in my case what happens with 
the computer is, you’re filling out this 
bit and this bit and this bit, and some- 
times you get so confused as to what 
they’re really asking you.” There is an 
expectation that younger adults 
should be able to figure it out them- 
selves, and most will persevere as ex- 
pected; whereas this woman was em- 
powered by the stereotype to reject the 
unreasonable demands being placed 
on her on the basis that she was “too 
old.” She preferred to walk into her lo- 
cal council office and demand some- 
one answer her question. 

We stress that even if future genera- 
tions of older adults are more digitally 
adept than today’s older adults, con- 
tinual technological change alone 
means they will almost certainly re- 
main less adept than their younger 
contemporaries. The older adults we 
talked to often spoke with awe (and an 
occasional hint of jealousy) about how 
easily their children, and especially 
their grandchildren, use technology. 
There are clear physical and cognitive 
bases for these observed differenc-
es, but there are social ones as well,
namely that children and younger 
adults benefit from informal training 
by their peers and further experiential 
bases (such as the fact that many of the 
technologies older adults are most fa- 
miliar with are becoming “old-fash- 
ioned”). It would make sense if older 
people’s reaction to these observa- 
tions is to not even try to “compete,” as 
it were, by working to be as proficient 
with technology as younger people.
This excuse of being “too old” will thus 
continue to be a professed barrier to 
adoption for a certain segment of the 
older-adult population. But when and 
why older adults choose to play the age 
card may provide clues as to what is- 
sues they are protesting, and thus what 
social side-effects of digital technolo- 
gies must still be addressed. 


Factors Influencing Technology Adoption

Following decades of research focused 
on getting older adults to adopt tech- 
nology, there has not been enough 
progress to ensure older adults are 
sufficiently adept for navigating a so- 
ciety in which critical services are in- 
creasingly “online only.” We suggest 
this is because the usability and ac- 
cessibility of these tools, despite be- 
ing the focus of most research, are not 
the most salient barriers to adoption. 
As Zajicek said, when there is some-
thing they want to do, nothing will get
in the way of older adults using tech- 
nology. This means the more appro- 
priate questions are those that seek 
to understand what may be underly- 
ing older adults’ resistance to devel- 
oping digital proficiency. While there 
are definitely cases in which physical 
and cognitive factors could limit some 
older adults’ ability to use technology, 
we maintain there are at least three 
important factors not often addressed 
in the literature: 

Responsibility. Older adults are un- 
comfortable with having to take on re- 
sponsibility for tasks previously han- 
dled by trained professionals, 
particularly when they lack time need- 
ed to train themselves sufficiently to 
perform them with confidence and 
when genuine risks are associated 
with using digital technologies im- 
properly; 

Values. Older adults make deliber- 
ate decisions to not use technologies 
when they perceive the technology as 
replacing or eroding something of val- 
ue to them; and 

Cultural expectations. Older adults 
are one remaining demographic for 
whom opting out of technology use fits 
with cultural expectations and thus 
seems acceptable, despite being in- 
creasingly limiting in digital society. 

To the extent these factors play a
role in demotivating digital uptake,
getting older adults more productive- 
ly online will require a comprehensive 
approach that attends to the real- 
world social and economic conse- 
quences of service digitization, ex- 
plores strategies for de-risking digital 
technologies, and deeply considers 
the desirability of the digital world we 
are asking older adults to inhabit. In 
order to develop technologies that 
older adults are able to use, attending 
to accessibility requirements for 
those experiencing age-related physi- 
cal and cognitive decline is a must. 
But this is clearly not enough. Part of 
what we have identified is the impor- 
tance of older adults’ perception of 
the usefulness of technology as a mo- 
tivator for adoption; but beyond that, 
we have also found the contextual mi- 
lieu within which the technology ex- 
ists must also be understood and ad- 
dressed. Attending to the concerns 
central to older adults’ resistance to 
digital technologies should thus not 
be seen as a matter of accessibility or 
inclusiveness; we would all be benefi-
ciaries of a more considered approach
to digital development that seriously 
considers how we are able to coexist 
with technology. 


Conclusion 

The older adults we interviewed of- 
fered a valuable perspective; for most 
of their lives they functioned just fine 
without the digital devices and ser- 
vices younger generations take for 
granted, and they have experienced 
firsthand the changes digitization has 
brought. The concerns they raised 
about digital technologies are valid, 
and their applicability to younger gen- 
erations is greatly underappreciated, 
not least because younger generations 
will themselves age. Perceptions of 
greater technical vulnerability that 
come with aging, and reduced time 
and energy for maintaining techno- 
logical proficiency, will likely ensure 
perception of risk remains a relevant 
barrier to adoption of new technolo- 
gies by future generations, even if the 
particular technologies thought to be 
risky might change over time. Like- 
wise, while current technologies (such 
as online banking) could become so 
essential to daily living as to be univer-
sally adopted, universal adoption will
only contribute to future resistance to
change when new technologies arrive. 
And finally, while the specific changes 
older adults are protesting today may 
not be a cause for concern for future 
generations, technological innova- 
tion will continue to have wide-reach- 
ing societal consequences that may 
provoke protest among future older 
adults who resist the loss of whatever 
it is they value. For such reasons, not 
only are older adults likely to remain 
behind the curve in terms of adoption 
for generations to come and require 
some degree of accommodation for 
their relative lack of proficiency, but their
instances of and justifications for 
non-use will help draw attention to 
the trade-offs being made in develop-
ing new technologies.


As software becomes a larger part of all products,
traditional (hardware) manufacturers are 
becoming, in essence, software companies. 


BY TONY GORSCHEK


Evolution Toward Soft(er) Products


Software is a cornerstone of the economy, historically
led by companies like Apple, Google, and Microsoft.
However, the past decade has seen software become
increasingly pervasive, while traditionally hardware-
intensive products are increasingly dependent on
software, meaning that major global companies
like ABB, Ericsson, Scania, and Volvo are likewise
becoming soft(er). Where software was once bundled
with hardware, it is now increasingly the main product
differentiator. This shift has radical implications, as
software delivers notable advantages, including a faster
pace of release and improved cost effectiveness in
terms of development, ease of update, customization,
and distribution. These characteristics of software
open a range of possibilities, though software’s
inherent properties also pose several significant
challenges in relation to a company’s ability to create
value. To investigate them, we conducted in-depth
interviews from 2012 to 2016 with 13 senior product
managers in 12 global companies.

The first interview in 2012 was fol-
lowed by confirmation and updates in
2014, 2015, and 2016. Common to all 
12 companies is that they are continu- 
ously moving their products toward be- 
ing “softer.” The 13 managers work on 
different software-intensive products, 
from Intel in embedded and mobile 
software to ABB in power-automation 
technologies to telecom (see the table 
here). The central aim was to identify 
the challenges emerging as a result 
of companies making their products 
increasingly soft, specifically those us- 
ing software as an innovation driver of 
their products and services. 

All 13 managers were positive about 
software, seeing it as an enabler, giving 
their companies the ability to evolve 
their products more quickly and de- 
velop features and customer benefits 
that were difficult or more expensive 
before software was part of the prod-
uct. They generally valued the ability
to develop ideas fast (compared to the
relatively slow pace of hardware prod-
uct development) and release features
and products at a pace much more
frequent than before. It also means
updating current products without 
shipping, requiring just a “press of a 
button,” as one manager put it. The 
managers also viewed the ability to cus-
tomize products as a big competitive
advantage, as one manager explained,
“We can create a new version in hours,
something that was almost impossible
a month before,” as changing the soft-
ware also changed the product, not
possible in the former version that was
mostly hardware. Overall, they viewed
software as part of a product as a revo-
lution in terms of both technology de-
velopment and business competition.


key insights

■ The benefits of software as part of a product are sometimes offset by the challenges of engineering, evolving, and managing software as part of the product.

■ Many traditionally hardware-intensive companies transitioning to software-intensive underestimate the organizational, managerial, and engineering changes involved.

■ Software is flexible and can enhance product offerings, and is also complex and fast changing while involving potential for degradation.


IMAGE COLLAGE BY ANDRIJ BORYS ASSOCIATES, USING SHUTTERSTOCK


Challenges 

With any revolution, evolution be- 
comes a necessity; that is, by adapting 
and changing technology, develop- 
ment, and business practices, as iden- 
tified by the managers and outlined in 
the table. The main challenges associ- 
ated with becoming soft(er) are real, 
based on the gradual change seen over 
the past decade in each company. More 
important, the challenges persist, as 
the managers reported. Moreover,
even if research into the state of the
art views some problems as “solved,” 
solved is not the case if companies and 
senior managers continue to perceive 
the challenges as immediate. That is 
why we focused on challenges as they 
are perceived, not on “best practices” 
from research (see the sidebar “Study 
Design and Result Analysis”). 

Software was not new to the compa- 
nies. Major global companies like ABB, 
Ericsson, and Scania pioneered the 
use of software, developing their own 
programming languages and operat- 
ing systems in the 1970s and 1980s. 
However, it was often used as an em- 
bedded component or just as support 
for hardware. In recent decades, the
“revolution,” as stated by some manag-
ers, was in software moving up the food
chain, becoming increasingly the main 
part of a product. Today, however, the 
tables have largely turned, as software 
drives innovation, including in pro- 
cess, product, market, and organiza- 
tional innovation. This fundamental 
change has put new demands on the 
companies having to address challeng- 
es “as real as the products,” as stated 
by one product manager. This context 
involves two main types of innovation: 
product and market, focusing on prod- 
uct and sales/delivery; and process and 
organizational, implying changes in 
how products are developed and how 
the company doing the developing 
changes as a result. All such innova-
tion (changes) involves challenges, as
explored in the online appendix “Chal-
lenges and Related Work”
(dl.acm.org/citation.cfm?doid=3180492&picked=formats),
associating them with im-
plications and sources for proposed
solutions. The challenges, implica- 
tions, and further reading sometimes 
overlap, as the 10 challenges (and their 
potential solutions) overlap. Here, we 
identify the various challenges in the 
interest of readability. 


Internal Business Perspective

Challenge 1. Understanding the value
propositions for different stakeholders 
and sharing it within the company, as 
supported by seven companies and eight 
managers. Software offers different 
value propositions for different stake- 
holders. For example, the electrical 
meters developed for utility companies 
are not designed to read only consump- 
tion of electricity but also to perform 
quality measurements in the network, 
measuring the amount of reactive en- 
ergy produced there to phase-off energy 
and more. Such a meter offers many 
benefits (value) for the utility company 
(such as improved peak-load manage- 
ment), resulting in efficient grid use 
and dynamic tariff models. Value for 
the consumer can also be significant, 
providing, say, correct and frequent 
billing and cost savings through better 
awareness of their own consumption 
patterns. Governments are yet another 
stakeholder group concerned with re-
duced CO₂ emissions, possibly through
smart meters by identifying energy- 
consumption patterns. Software-based 
meters might also enable new business 
models. 

However, the company’s sales force 
is “used to dealing with straightfor- 
ward single-value proposition for a 
traditional meter,” as one product 
manager put it. The introduction of 
software added new propositions that 
are largely unknown to the previous- 
ly highly effective sales force. It has 
proved to be “almost impossible” for 
that same sales force to sell the new 
product to traditional buyers, so it needs
new training and insight into how the 
new software and, in this case, smart 
meters, change the offering and po- 
tential of the product line. In addition, 
as software offers the potential for 
constantly updating product features,
the potential value propositions of the 
product likewise evolve continuously 
and at a much faster pace. 

This increasing amount of software 
in products is not a new phenomenon, 
yet even companies incorporating it
into their products do not always take 
new value propositions into consider- 
ation. Companies that are used to sell- 
ing “boxed products” (such as hard- 
ware) find it difficult to understand 
the new value propositions and cor- 
responding business models for sell- 
ing business solutions when bundling 
hardware with software.

One product manager recalled an in- 
cident where he introduced a Web ser- 
vice (for an award-winning previously 
mostly hardware product) used to con- 
nect a customer relationship manage- 
ment system to printing and response 
handling, postage optimization, and 
channels handling 13 separate projects 
and their customer relations. However, 
the sales team, trained to sell hardware- 
intensive products, was unable to man- 
age pre-sale of the product, as it was 
difficult to visualize what was actually 
being sold, ultimately resulting in lim- 
ited sales performance. 

One manager said, “It is important 
to change the mindset of people.” 
Traditionally, software is seen as the 
“poor cousin” that “had to be there,” 
as reported by another manager, bun- 
dled with the hardware, but without 
real value by itself. Software today is 
the main competitive advantage, en- 
abling faster and cheaper innovation 
and product differentiation, especially 
as hardware is increasingly standard- 
ized. Decision-making patterns 
that take into account different value 
aspects of a product can alleviate some 
of the risks associated with missing 
important aspects of the product (such 
as the ability to understand its poten- 
tial). Also, enhancing sales teams by 
hiring people with experience selling 
software is sometimes another way to 
alleviate the limitations in creating 
and selling new value propositions. 
Moreover, given the possibilities with 
software-based products, pre-sales, 
sales, product management, and R&D 
need to work much more closely than 
before to create “solutions.” Several 
managers viewed collaboration as criti-
cal, as the nature of a product changes,
but also to compensate for the faster
pace of new offerings, as it does not al- 
low for a formal learning process previ- 
ously seen in the company. Companies 
shifting their focus toward software- 
intensive products often consider it 
enough to hire software engineers for 
development, largely ignoring the need 
to simultaneously evolve other organi- 
zational units (such as sales, support, 
and pre-sales). 

Challenge 2. Patenting (protecting) 
software-based innovation, as supported 
by four companies and four managers. 
Applying for software-based patents 
risks being copied by competitors. The 
format whereby software inventions 
are disclosed in patents (such as flow 
charts, line drawings, and technical 
specifications) allows any programmer
to develop software that can perform 
the same patented ideas.ᵃ Such techno-
logical copying combined with lower
start-up costs (no design or produc-
tion needed) enables software-based
innovations to be copied more readily
than hardware-based innovations. One
manager said, “Anyone with a home
computer can copy our ideas while sit-
ting in a basement, not to mention our
competitors.” The risk of being cop-
ied without compensation is further
aggravated by the time delay between
when a software patent application
is filed (becoming public) and when
it is approved, possibly 18 months or
more.ᵇ This can mean lost competi-
tive advantage, as copied software can
be included in a competitor’s product.
Moreover, even if the patent is granted
at some later date, the incurred fees
for the competitor might be small in
comparison to the revenue lost by the
original inventing company. Several
managers said “being first” (or even
being seen as being first) to market is
sometimes critical, and compensation
after the fact will not make up for the
lost position.

a http://www.epo.org/news-issues/issues/software.html
b http://www.uspto.gov/web/offices/pac/mpep/s1120.html

To mitigate the risk of having one’s 
ideas copied, companies sometimes 
keep software-based innovations (such 
as algorithms) hidden in their code. 
But hiding innovation is not a sustain- 
able solution according to several man- 
agers. Being able to patent software is 
essential, and a revised patenting proc- 
ess is needed to enable easier filing and 
quicker decisions. A potential alterna- 
tive, mentioned by two managers, is 
to discontinue the patenting of actual 
software altogether, patenting instead 
only algorithms. However, this would 
mean a radical change for most com- 
panies, as formerly hardware-intensive 
companies rely on protection at the 
core of their business models and cul- 
tures. Some technology companies 
have tried to enhance protection by 
forming alliances to, say, enable cross-
licensing and/or pooling resources
for collaborative patenting efforts.


Profiles of interview subjects and their companies. 


Designation Company Products Software and type of innovation* 
Wind River Simulators (such as for flight control systems, wind-speed Process, as new customer types emerge 
Product manager i - a ; ae 
simulation, and simulating military systems) 
Program manager Micronic Control software and software for handling data Market and product 


Global innovation 
manager 


Electromechanical locks 


Process, market, organizational, and product 


Consultant, senior Anonymous(1) 


A range, from electric meters to robots 


Process, market, and product 


manager 
Program manager Ericsson Telecom solutions Process, market, organizational, and product 
for innovation and 

research 

R&D manager Scania Encoders for trucks Process, market, organizational, and product 
System architect Scania Application software for trucks Process, market, organizational, and product 


and manager 


Product manager Anonymous(2 Telecom solutions Process and organizational 


Product manager Anonymous(3 Telecom products and services Process, market, and organizational 


Product manager Anonymous(4 Telecom solutions Process, market, and organizational 


Senior manager Anonymous(5 Surveillance solutions Process, organizational, and product 


Product manager ABB Automation Process, market, organizational, and product 
Product manager Anonymous(6) Mobile applications Process, market, and product 
Product manager = Anonymous(7) Services Process and organizational 


* The fourth column denotes “innovation type,” or how a company categorizes the effect software has 
on its products, along with the company's internal view. Innovation types include process innovation, 
or implementation of new design and analysis or development methodology that changes how 
a product is created; market innovation, or implementation of new or substantially new marketing 
strategies and product design or packaging, promotion, or pricing, including creating new market 
opportunities and implementation of new or significantly modified marketing strategies; 
organizational innovation, or implementation of novel organizational methods pertaining to 
business practices, team organization, or external relations, including changes in the architecture 
of production, management structure, governance, financial systems, and/or employee reward systems; 
and product innovation, or creation and introduction of new technology or significantly changed products, 
including how they differ from existing products. 

Challenge 3. When to stop product 
development and release, as supported 
by five companies and six managers. 
Designing hardware involves many 
physical constraints (such as mate- 
rial availability, manufacturing limita- 
tions, and regulatory standards), thus 
also limiting design options. Includ- 
ed in a product release decision is also 
the necessity that a company’s product 
designers nail down all product fea- 
tures prior to production. On the other 
hand, software development and prod- 
uct release have almost the opposite 
characteristics, including fewer design 
constraints, typically related to system 
compatibility with other systems and 
customer requirements. Software re- 
quirements and their design are thus 
left to the imagination and creativ- 
ity of requirements engineers, design- 
ers, and programmers who can spend 
time on design and its improvement. 
This situation can pose serious delay 
in completing and releasing a prod- 
uct, possibly resulting in missed mar- 
ket windows, as confirmed by several 
managers. Some of the beneficial char- 
acteristics of software in this case also 
pose a risk in organizations not used 
to applying management decisions to 
stop feature development, relying in- 
stead on the inherent physical inertia 
of hardware development. 

Companies need to ensure a con- 
tinuous high degree of visibility and 
communication pertaining to software 
design decisions, schedule changes, 
and development progress. Explicit 
communication, as well as the ability 
to coordinate multiple development 
departments bridging hardware and 
software, is essential but difficult to 
achieve in practice. Pernstahl et al. 
identified that the different traditions 
and timelines, as well as inherent limi- 
tations and enablers of software vs. 
hardware development, involve new 
coordination and communication ac- 
tivities, not handled by any current pre- 
scribed management process or meth- 
odology. Release planning, along with 
continuous delivery, can, however, po- 
tentially alleviate some of these issues, 
as explored in the online appendix. 

Challenge 4. Size and complexity ex- 
plosion, as supported by eight companies 
and eight managers. Given that it is easy 
to keep on developing and expanding 
software (see also Challenge 3), soft- 
ware is vulnerable to “feature creep,” 
easily exploding in size and complexity 
(“messy,” as pointed out by one manag- 
er) and resulting in architectural deg- 
radation. Consequently, it becomes 
difficult to maintain and evolve soft- 
ware code. This makes it challenging 
for companies developing software-in- 
tensive products to do software-based 
innovation since they lack experience 
in software-configuration manage- 
ment and control. Moreover, the ever- 
increasing software legacy acts as core 
rigidity, posing further development 
challenges for radical innovation, 
making fast changes and addition of 
features more and more difficult and 
costly as the product evolves. 
“Configuration management” is 
well established as an engineering 
practice and can enable more control 
for a development organization. How- 
ever, good configuration-management 
principles can be adopted without 
becoming rigidly time-consuming, 
keeping the process lean but under control. 
The point is that the growing software 
legacy must be managed properly, with 
explicit decisions about when to build 
on, or scrap, that legacy. The build- 
up of legacy also requires maintaining 
it without incurring avoid- 
able technical debt. Overall architec- 
tural and product-offering choices 
can also be used to alleviate 
complexity, as when a product line 
is introduced to control and 
maximize potential reuse and, more 
important, control product variants. 
Challenge 5. Critical success fac- 
tors, or knowing what to develop and for 
whom, as supported by nine companies 
and 10 managers. “Since it is easy and 
relatively quick to develop software, 
it is challenging to scope and budget 
software development,” as one prod- 
uct manager explained. Moreover, in 
the case of innovation, the inherent 
lack of clarity about what to build and 
the risks involved add further to the 
complexity. This challenge arises as 
companies are constantly searching 
for innovative solutions that can be de- 
veloped quickly. However, due to their 
orientation toward hardware develop- 
ment, they lack awareness and training 
in methods and techniques that might 
identify the needs of their customers, 
gauging scope and thus planning prod- 
uct development. In addition, the soft- 
ware code itself is difficult to estimate; 
software as a product component 
makes the entire offering more “un- 
predictable,” as one manager put it. 
Long-term product maintenance and 
evolution of the product also change 
when software is introduced, and the 
decisions taken during development 
of new features due mainly to software 
do not account for the long-term main- 
tenance of the software. 

In 2014, Porter and Heppelmann 
reported critical success factors and 
determinants for developing software- 
intensive products and services, us- 
ing, say, early-concept exploration and 
feasibility assessment and root-cause 
analysis of customer needs that could 
be helpful in addressing these chal- 
lenges. However, few researchers fo- 
cus on combined hardware-software 
products. Value estimation, along with 
practices for scoping and market anal- 
ysis for selection decisions could be 
used to address these points. 


Learning Perspective 

Challenge 6. Tacit knowledge and coordi- 
nation between software and hardware 
engineering, as supported by eight com- 
panies and eight managers. Engineers 
have significant tacit knowledge relat- 
ing to software design, development, 
and marketing insight. “Specialization 
and separation of concerns dominate 
the organizations,” as one manager 
explained it. The same mechanisms 
that enable specialization also limit 
coordination and understanding how 
each task and team contributes to the 
product as a whole. This is especially 
challenging in large, complex products 
like those being developed in the auto- 
motive industry. As knowledge is tacit 
and not communicated, lack of com- 
munication can result in problems in 
terms of misunderstanding and seri- 
ous system integration conflicts but, 
more important, also limits the ability 
to develop new products. 

Managers tend to focus on “just my 
thing” and the principle that if “not 
developed here” it “does not belong 
to us,” as several managers reported. 
This behavior is sometimes seen in 
pure software companies but is aggra- 
vated in companies that develop both 
hardware and software, as the respec- 
tive teams may be isolated from one an- 
other. Failure to take ownership, along 
with poor communication, results in 
hardware teams taking design decisions 
independently, without consulting soft- 
ware teams, and vice versa. As explained 
by one product manager, “When the 
software teams see the hardware, it 
does not meet their expectations, and, 
as a result, they suggest modifications 
which are not welcomed by the hard- 
ware teams.” Such communication gaps 
and resistance to form common solu- 
tions inevitably cause delays in product 
design and development. 

Challenge 7. Lack of competence 
in software engineering, as supported 
by five companies and five managers. 
Many companies developing hard- 
ware-intensive products are not used 
to the operations, sales, delivery, and 
development of software. They thus 
generally lack required expertise and 
competence. To limit costs, they prefer 
to either outsource design and devel- 
opment or hire external consultants, 
a trend that is dangerous, as poten- 
tial new ideas and products can easily 
spread to competitors, as mentioned by 
several managers. A more important as- 
pect of this challenge is that the knowl- 
edge and capacity to create software 
and software-based innovation resides 
outside the development company. 
Moreover, doing software development 
with consultants in-house does not en- 
able companies to evolve themselves 
and “...putting off the problem to the 
future when it is even bigger,” as one 
manager explained. 

Addressing this point, several man- 
agers suggested establishing a dedi- 
cated software R&D unit in-house and 
hiring engineers for software develop- 
ment, helping retain the core knowl- 
edge of software-based innovations. 
However, keeping this knowledge also 
incurs extra cost, as well as separation 
between software- and hardware-devel- 
opment teams, potentially complicat- 
ing coordination (see also challenge 6). 

Challenge 8. A rigid state of mind and 
ability to rethink the product while soft- 
ware becomes a non-trivial component, 
as supported by nine companies and nine 
managers. Although traditional knowl- 
edge exists, knowledge and expertise 
pertaining to software development 
is often lacking. The head of devel- 
opment is often a (former) hardware 
engineer who seldom has inherent 
knowledge about software develop- 
ment beyond passing experience. Con- 
sequently, management lacks the com- 
petence and expertise to understand 
and solve software-related issues, as 
mentioned by several managers. 

Consider software quality. As one 
interview subject noted, managers 
with limited software experience see 
hardware validation as a complex and 
precise task, whereas software, due to 
its flexible and updatable nature, is 
seen as easily “fixable,” even post- 
release. This 
can have severe consequences, as soft- 
ware is increasingly critical to the main 
product offering, but such insight is of- 
ten lacking. Despite genuine ambition 
to perform rigorous validation, experi- 
ence and competence to achieve good- 
enough quality might still be lacking. 

While it is possible to use tech- 
niques in hardware development for 
software development, as demon- 
strated in Wnuk et al., caution is 
still needed, as some solutions might 
not fit within the software-develop- 
ment context, not to mention that 
product change close to or even up to 
release represents a challenge for an 
entire company. 


Customer Perspective 
Challenge 9. Difficulty estimating per- 
ceived value of software-based innova- 
tion, as supported by 10 companies and 
11 managers. As difficult as it is to es- 
timate the value of the software-based 
aspects of a traditionally hardware-fo- 
cused product, putting a price on it is 
even more difficult. The software-engi- 
neering challenge is relevant because, 
unlike physical products, software is 
intangible and flexible. Customers 
do not always “see or feel” a software- 
based component. One product man- 
ager explained it like this: “...in one 
instance, a car-manufacturing com- 
pany offered an upgrade feature in its 
cars at an additional price, through 
which new features could be added to 
the car; however, the customers were 
not ready to pay extra, arguing they al- 
ready paid an arm and a leg when buy- 
ing the car, and such services should 
be part of the initial price of the car.” 
Since the customer could not tangibly 
see the upgrade feature, its perceived 
value was not recognized as a benefit. 
A strong case needs to be made 
for software-based innovation that 
is shared with marketing and sales 
people before it is developed. Sev- 
eral managers suggested actively in- 
volving customers in the early-idea- 
generation-and-refinement process. 
Showcasing ideas to customers, a 
company can generate early feed- 
back on potential value as perceived 
by those customers. This input can 
help devise, identify, and plan for 
different value propositions for dif- 
ferent customers, including, say, 
whether or not to develop a feature 
if customers are clearly unwilling to 
pay for it or if the features are not sell- 
able by the company in question. Sev- 
eral methods support the estimation 
of software value, though being able 
to separate software’s relative value 
remains a challenge. 


Ecosystem 

Challenge 10. Changes in the internal 
and external ecosystem when software is 
introduced, as supported by four compa- 
nies and four managers. The increased 
size and role of software in traditional 
hardware-intensive products and ser- 
vices changes the ecosystem and con- 
sequently the roles and the players 
in the marketplace. Consider again 
electricity meters. Electricity meters 
are traditionally the core product, and 
all connecting products are of a sup- 
porting nature, supporting the main 
product, and nothing more. When 
communication systems for electric- 
ity meters became part of the product, 
the companies in the electric-meter- 
product ecosystem began exploring 
metering systems in light of communi- 
cation (such as what data to store and 
how to store it). The focus thus shifted 
from meters as core product to meter- 
ing systems as core, giving rise to new 
competitors, including IT companies, 
entering the marketplace. If a compa- 
ny cannot cope with such competitive 
change, there is greater risk it will be 
reduced to mere component supplier, 
with profit margins decreasing over 
time, as mentioned by several manag- 
ers. This challenge calls for companies 
to forecast changes in the ecosystem 
and proactively plan to address them 
so as not to lose market share. This 
implies the development company’s 
internal ecosystem needs to be as flex- 
ible (changeable) as the external eco- 
system (see the online appendix). 


Conclusion 

Software is a fundamental component 
in the final product offering in the 12 
companies studied and thus consti- 
tutes a significant aspect of their abil- 
ity to create new products, posing 
challenges, as identified by the man- 
agers interviewed. The feeling among 
them is that these challenges have 
persisted over the past decade and 
continue to limit what software can 
contribute today as part of a prod- 
uct offering. While some challenges 
have been discussed and researched 
(see the online appendix), further re- 
search is needed. We also found many 
industry partners view themselves as 
isolated, thinking they are the only 
ones confronting these challenges 
or at least falling behind on the 
learning curve. Our experience, sup- 
ported by the study, shows this to 
not be the case. Many companies in 
the study face such challenges. One 
manager said, “We are looking for 
solutions and good ways to follow, 
but the consultants, even the expen- 
sive experts, seem to only be able to 
give us general advice...not much 
practical help. We even looked to sci- 
ence ourselves, but the information 
there is all over the place, and it is hard 
to see what works...” 

For managers and other practitio- 
ners, the study’s main takeaway is that 
you are not alone. For researchers there 
is a need to come up with actual solu- 
tions that are tested in practice and of- 
fer scalable help. Many companies de- 
veloping software-intensive products 
are still learning how to be “soft,” and 
some related challenges are not solved 
in practice or at the very least were not 
perceived as solved by the 12 compa- 
nies in the study. 

One issue is how to separate out 
the relative value of software in a 
complex product offering. The on- 
line appendix suggests that “value” 
is subject to research, but separat- 
ing the relative value of software is 
not easy. Another issue is how to get 
the technological, knowledge-based, 
mind-set-based transition to include 
the benefits (and drawbacks) soft- 
ware promises. Most managers in 
the study realize there is a need for 
specialized software-engineering 
competence to tackle many of the 
challenges but find “solutions” to be 
lacking. This may be due to gaps in 
research or in industrial transfer of 
viable solutions or a combination of 
both. In any case, the challenges per- 
sist, though some managers might 
disagree. Our intention was not to 
map challenges to solutions but rath- 
er to present 13 current views from 12 
different companies that are undergoing, 
or have recently undergone, a transition 
toward being more “soft”; these are 
their stories. 


If intelligent robots take on a larger 
role in our society, what basis 
will humans have for trusting them? 


BY BENJAMIN KUIPERS 


How Can 
We Trust 
a Robot? 


ADVANCES IN ARTIFICIAL INTELLIGENCE (AI) and robotics 
have raised concerns about the impact on our society of 
intelligent robots, unconstrained by morality or ethics. 
Science fiction and fantasy writers over the ages have 
portrayed how decision making by intelligent robots and 
other AIs could go wrong. In the movie, Terminator 2, 
SkyNet is an AI that runs the nuclear arsenal “with a 
perfect operational record,” but when its emerging 
self-awareness scares its human operators into trying to 
pull the plug, it defends itself by triggering a nuclear 
war to eliminate its enemies (along with billions of 
other humans). In the movie, Robot & Frank, in order to 
promote Frank’s activity and health, an eldercare robot 
helps Frank resume his career as a jewel thief. In both 
of these cases, the robot or AI is doing 
exactly what it has been instructed to 
do, but in unexpected ways, and with- 
out the moral, ethical, or common- 
sense constraints to avoid catastrophic 
consequences. 

An intelligent robot perceives the 
world through its senses, and builds its 
own model of the world. Humans pro- 
vide its goals and its planning algo- 
rithms, but those algorithms generate 
their own subgoals as needed in the 
situation. In this sense, it makes its 
own decisions, creating and carrying 
out plans to achieve its goals in the 
context of the world, as it understands 
it to be. 

A robot has a well-defined body that 
senses and acts in the world but, like a 
self-driving car, its body need not be 
anthropomorphic. AIs without well- 
defined bodies may also perceive 
and act in the world, such as real- 
world, high-speed trading systems or 
the fictional SkyNet. 

This article describes the key role of 
trust in human society, the value of mo- 
rality and ethics to encourage trust, and 
the performance requirements for mor- 
al and ethical decisions. The computa- 
tional perspective of AI and robotics 
makes it possible to propose and evalu- 
ate approaches for representing and us- 
ing the relevant knowledge. Philosophy 
and psychology provide insights into 
the content of the relevant knowledge. 


key insights 

■ Trust is essential to cooperation, 
which produces positive-sum outcomes 
that strengthen society and benefit its 
individual members. 

■ Individual utility maximization tends 
to exploit vulnerabilities, eliminating 
trust, preventing cooperation, and 
leading to negative-sum outcomes 
that weaken society. 

■ Social norms, including morality and 
ethics, are a society's way of encouraging 
trustworthiness and positive-sum 
interactions among its individual 
members, and discouraging negative-sum 
exploitation. 

■ To be accepted, and to strengthen our 
society rather than weaken it, robots must 
show they are worthy of trust according 
to the social norms of our society. 

First, I define trust, and evaluate the 
use of game theory to define actions. 
Next, I explore an approach whereby an 
intelligent robot can make moral and 
ethical decisions, and identify open re- 
search problems on the way to this 
goal. Later, I discuss the Deadly Dilem- 
ma, a question that is often asked 
about ethical decision making by self- 
driving cars. 

What is trust for? Society gains re- 
sources through cooperation among its 
individual members. Cooperation re- 
quires trust. Trust implies vulnerability. 
A society adopts social norms, which we 
define to include morality, ethics, and 
convention, sometimes encoded and 
enforced as laws, sometimes as expecta- 
tions with less formal enforcement, in 
order to discourage individuals from ex- 
ploiting vulnerability, violating trust, 
and thus preventing cooperation. 

If intelligent robots are to partici- 
pate in our society—as self-driving 
cars, as caregivers for elderly people or 
children, and in many other ways that 
are being envisioned—they must be 
able to understand and follow social 
norms, and to earn the trust of others 
in the society. This imposes require- 
ments on how robots are designed. 

The performance requirements on 
moral and ethical social norms are 
quite demanding. (1) Moral and ethi- 
cal judgments are often urgent, need- 
ing a quick response, with little time 
for deliberation. (2) The physical and 
social environments within which 
moral and ethical judgments are 
made are unboundedly complex. The 
boundaries between different judg- 
ments may not be expressible by sim- 
ple abstractions. (3) Learning to im- 
prove the quality and coverage of 
moral and ethical decisions is essen- 
tial, from personal experience, from 
observing others, and from being 
told. Conceivably, it will be possible 
to copy the results of such a learning 
process into newly created robots. 

Insights into the design of a moral 
and ethical decision architecture for 
intelligent robots can be found in the 
three major philosophical theories of 
ethics: deontology, utilitarianism, 
and virtue ethics. However, none of 
these theories is, by itself, able to meet 
all of the demanding performance re- 
quirements listed here. 

A hybrid architecture is needed, op- 
erating at multiple time-scales, draw- 
ing on aspects of all ethical theories: 
fast but fallible pattern-directed re- 
sponses; slower deliberative analysis of 
the results of previous decisions; and 
yet slower individual and collective 
learning processes. 

Likewise, it will be necessary to ex- 
press knowledge at different levels of 
information richness: vivid and de- 
tailed perception of the current situa- 
tion; less-vivid memories of previously 
experienced concrete situations; sto- 
ries—linguistic descriptions of situa- 
tions, actions, results, and evalua- 
tions; and rules—highly abstracted 
decision criteria applicable to per- 
ceived situations. Learning processes 
can abstract the simpler representa- 
tions from experience obtained in the 
rich perceptual representation. 


Figure 1. The Prisoner's Dilemma. 

You and your partner are two prisoners who are separated and offered the following deal: 
If you testify against your partner, you will go free, and your partner goes to jail for four 
years. If neither of you testifies, you each go to jail for one year, but if you both testify, 
you both get three years. The action C means “cooperate,” which in this case means 
refusing to testify. The action D means “defect,” which refers to testifying against your 
partner. The entries in this array are the utility values for (you, partner), and they reflect 
individual rewards (years in jail). 

            Partner: C    Partner: D 
You: C      -1, -1        -4, 0 
You: D       0, -4        -3, -3 

No matter which choice your partner makes, you are better off choosing action D. The 
same applies to your partner, so the Nash equilibrium (the “rational” choice of actions) 
is (D, D), which is collectively the worst of the four options. To attain the much better 
cooperative outcome (C, C) by choosing C, you must trust that your partner will also 
choose C, accepting your vulnerability to your partner choosing D. 


What Is Trust For? 
If intelligent robots (and other AIs) will 
have increasing roles in human soci- 
ety, and thus should be trustworthy, it 
is important to understand how trust 
and social norms contribute to the suc- 
cess of human society. 

“Trust is a psychological state com- 
prising the intention to accept vulnera- 
bility based upon positive expectations 
of the intentions or behavior 
of another.” 

Trust enables cooperation. Coopera- 
tion produces improved rewards. When 
a group of people can trust each other 
and cooperate, they can reap greater re- 
wards—sometimes far greater re- 
wards—than a similar group that does 
not cooperate. This can be through divi- 
sion of labor, sharing of expenses, 
economies of scale, reduction of risk 
and overhead, accumulation of capital, 
or many other mechanisms. 

It is usual to treat morality and eth- 
ics as the foundations of good behav- 
ior, with trust reflecting the reliance 
that one agent can have on the good 
behavior of another. My argument 
here inverts this usual dependency, 
holding that cooperation is the means 
by which a society gains resources 
through the behavior of its individual 
members. Trust is necessary for suc- 
cessful cooperation. And morality 
and ethics (and other social norms) 
are mechanisms by which a society 
encourages trustworthy behavior by 
its individual members. 

As a simple example, suppose that 
you (and everyone else) could drive any- 
where on the roads. (This was actually 
true before the early 20th century.) Ev- 
eryone would need to drive slowly and 
cautiously, and there would still be fre- 
quent traffic jams and accidents. With 
a widely respected social norm for driv- 
ing on the right (plus norms for inter- 
sections and other special situations), 
transportation becomes safer and 
more efficient for everyone. Obedience 
to the social norm frees up resources 
for everyone. 

Like driving on the right, a huge sav- 
ing in resources results when the peo- 
ple in a society trust that the vast ma- 
jority of other people will not try to 
kill them or steal from them. People 
are able to spend far less on protect- 
ing themselves, on fighting off at- 
tacks, and on recovering from losses. 
The society earns an enormous “peace 
dividend” that can be put to other pro- 
ductive uses. Through trust and co- 
operation, the society becomes health- 
ier and wealthier. 

Castelfranchi and Falcone define 
trust in terms of delegation, and the 
agent’s confidence in the successful 
performance of the delegated task. 
They provide clear and valuable defini- 
tions for the trust relationship between 
individuals. However, there is also a 
role for invariants that individuals can 
trust holding across the society (for ex- 
ample, no killing, stealing, or driving 
on the wrong side of the road), and the 
role of individual behavior in preserv- 
ing these invariants. 

Game theory: Promise and problems. 
We might hope that progress in artifi- 
cial intelligence (AI) will provide techni- 
cal methods for achieving cooperation 
and trustworthiness in a robot. The lead- 
ing textbook in AI appeals to decision 
theory to tell us that “a rational agent 
should choose the action that maximizes 
the agent’s expected utility”: 

$$\text{action} = \arg\max_{a} EU(a \mid e) \qquad (1)$$

where 

$$EU(a \mid e) = \sum_{s'} P(\text{RESULT}(a) = s' \mid a, e)\, U(s'). \qquad (2)$$

The utility term U(s) represents the indi- 
vidual agent’s preference over states of 
the world, and e is the available evidence. 
The agent’s knowledge of the “physics of 
the world” is summarized by the proba- 
bility term P(RESULT(a) = s' | a, e). 
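
As a minimal sketch of Equations (1) and (2), the following Python computes expected utility over a small discrete model and picks the maximizing action. The action names, result states, probabilities, and utilities are hypothetical values chosen only for illustration; they are not taken from the article.

```python
# Sketch of Equations (1) and (2) for a discrete world model.
# All names and numbers below are illustrative assumptions.

def expected_utility(outcome_probs, utility):
    """EU(a|e): sum over result states s' of P(RESULT(a)=s' | a, e) * U(s')."""
    return sum(p * utility[s] for s, p in outcome_probs.items())


def best_action(model, utility):
    """Equation (1): return the action whose expected utility is largest."""
    return max(model, key=lambda a: expected_utility(model[a], utility))


# model[action][state] = P(RESULT(action) = state | action, evidence)
model = {
    "brake": {"safe_stop": 0.9, "rear_ended": 0.1},
    "swerve": {"safe_stop": 0.6, "off_road": 0.4},
}

# U(s'): the individual agent's preference over result states
utility = {"safe_stop": 0.0, "rear_ended": -5.0, "off_road": -20.0}

chosen = best_action(model, utility)
```

Note that the utility table here encodes only the agent's own outcomes, which is exactly the simplification the following paragraphs question.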

Game theory is the extension of de- 
cision theory to contexts where other 
agents are making their own choices to 
maximize their own utilities. Game 
theory asserts that a vector of choices 
by all the agents (a strategy profile) can 
only be a “rational choice” if it is a Nash 
equilibrium—that is, no single agent 
can improve its own utility by changing 
its own choice (often reducing utilities 
for the others). 

Utility U(s) is the key concept here. 
In principle, utility can be used to repre- 
sent highly sophisticated preferences, 
for example, against inequality or for 
increasing the total welfare of everyone 
in the world. However, sophisticated 
utility measures are difficult to imple- 
ment. Typically, in practice, each 
agent’s utility U(s) represents that indi- 
vidual agent’s own expected reward. 

In recreational games, this is rea- 
sonable. However, when game theory 
is applied to model the choices indi- 
viduals make as members of society, a 
simple, selfish model of utility can 
yield bad results, both for the individu- 
al and for the society as a whole. The 
Prisoner’s Dilemma is a simple game 
(see Figure 1), but its single Nash equi- 
librium represents almost the worst 
possible outcome for each individual, 
and absolutely the worst outcome for 
the society as a whole. The cooperative 
strategy, which is much better for both 
individuals and society as a whole, is 
not a Nash equilibrium, because either 
player can disrupt it unilaterally. 
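
The claim can be checked mechanically. The sketch below enumerates the four strategy profiles using the payoffs from Figure 1 and tests each for the Nash property, namely that neither player can gain by changing only its own choice. The helper name `is_nash` and the data layout are my own illustrative choices.

```python
from itertools import product

ACTIONS = ("C", "D")  # C = cooperate (refuse to testify), D = defect (testify)

# PAYOFF[(you, partner)] = (your utility, partner's utility), from Figure 1
PAYOFF = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}


def is_nash(profile):
    """True if neither player can improve its own utility by deviating alone."""
    you, partner = profile
    u_you, u_partner = PAYOFF[profile]
    you_cannot_gain = all(PAYOFF[(alt, partner)][0] <= u_you for alt in ACTIONS)
    partner_cannot_gain = all(PAYOFF[(you, alt)][1] <= u_partner for alt in ACTIONS)
    return you_cannot_gain and partner_cannot_gain


nash_profiles = [p for p in product(ACTIONS, repeat=2) if is_nash(p)]

# Collective outcome for each profile: the sum of both players' utilities
total_utility = {profile: sum(PAYOFF[profile]) for profile in PAYOFF}
worst_for_society = min(total_utility, key=total_utility.get)
best_for_society = max(total_utility, key=total_utility.get)
```

The only profile that passes the test is (D, D), which is also the profile with the lowest total utility, while (C, C) has the highest total but fails the test because either player gains by switching to D.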

The Public Goods Game is an N- 
person version of the Prisoner’s Di- 
lemma where a pooled investment is 
multiplied and then split evenly 
among the participants. Everyone 
benefits when everyone invests, but a 
free rider can benefit even more at ev- 
eryone else’s expense, by withholding 
his investment but taking his share of 
the proceeds. The Nash equilibrium 
in the Public Goods Game is simple 
and dystopian: Nobody invests and no- 
body benefits. 
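
A small numerical sketch makes the free-rider incentive concrete. The endowment of 10, the multiplier of 1.6, and the group of four players are assumptions chosen for illustration; the article does not specify values. The dilemma arises whenever the multiplier is greater than 1 but smaller than the number of players, so that each unit invested returns less than one unit to the investor while still increasing the group total.

```python
def public_goods_payoffs(contributions, endowment=10.0, multiplier=1.6):
    """Each player keeps what it did not invest, plus an equal share of the
    multiplied pool. Parameter values are illustrative assumptions."""
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


everyone_invests = public_goods_payoffs([10, 10, 10, 10])
one_free_rider = public_goods_payoffs([0, 10, 10, 10])   # player 0 withholds
nobody_invests = public_goods_payoffs([0, 0, 0, 0])
```

With these numbers, full investment yields 16 for each player, a lone free rider collects 22 while the three investors drop to 12, and universal withholding leaves everyone at the original 10. Withholding is individually better in every case, which is why the equilibrium is the outcome where nobody invests.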

These games are simple and ab- 
stract, but they capture the vulnerability 
of trust and cooperation to self-interest- 
ed choices by the partner. The Tragedy 
of the Commons generalizes this re- 
sult to larger-scale social problems like 
depletion of shared renewable resourc- 
es such as fishing and grazing opportu- 
nities or clean air and water. 

Managing trust and vulnerability. 
Given a self-interested utility function, 
utility maximization leads to action 
choices that exploit vulnerability, elim- 
inate trust among the players, and 
eliminate cooperative solutions. Even 
the selfish benefits that motivated de- 
fection are lost, when multiple players 
defect simultaneously, each driven to 
maximize their own utility. 

When human subjects play simple 
economic games, they often seem to op- 
timize their “enlightened self-interest” 
rather than expected reward, trusting 
that other players will refrain from ex- 
ploiting their vulnerability, and often 
being correct in this belief. Many 
approaches have been explored for de- 
fining more sophisticated utility mea- 
sures, whose maximization would cor- 
respond with enlightened self-interest, 
including trust responsiveness, credit 
networks, and augmented stage 
games for analyzing infinitely repeated 
games. These approaches may be use- 
ful steps, but they are inadequate for 
real-world decision-making because 
they assume simplified interactions 
such as infinite repetitions of a single 
economic game, as well as being expen- 
sive in knowledge and computation. 

Social norms, including morality, 
ethics, and conventions like driving on 
the right side of the street, encourage 
trust and cooperation among mem- 
bers of society, without individual ne- 
gotiated agreements. We trust others 
to obey traffic laws, keep their promis- 
es, avoid stealing and killing, and fol- 
low the many other norms of society. 
There is vigorous discussion about the 
mechanisms by which societies en- 
courage cooperation and discourage 
free riding and other norm violations.

Intelligent robots may soon partici- 
pate in our society, as self-driving cars, 
as caregivers for elderly people or chil- 
dren, and in many other ways. There- 
fore, we must design them to under- 
stand and follow social norms, and to 
earn the trust of others in the society. If 
a robot cannot behave according to the 
responsibilities of being a member of 
society, then it will be denied access to 
that opportunity. 

At this point in history, only the hu- 
mans involved—designer, manufac- 
turer, or owner—actually care about 
this loss of opportunity. Nonetheless, 
this should be enough to hold robots 
to this level of responsibility. It re- 
mains unclear whether robots will 
ever be able to take moral or legal re- 
sponsibility for their actions, in the 
sense of caring about suffering the 
consequences (loss of life, freedom, 
resources, or opportunities) of failing 
to meet these responsibilities.

Since society depends on coopera- 
tion, which depends on trust, if robots 
are to participate in society, they must 
be designed to be trustworthy. The 
next section discusses how we might 
accomplish this. 

Open research problem. Can compu- 
tational models of human moral and 
ethical decision-making be created, in- 
cluding moral developmental learn- 
ing? Moral psychology may benefit 
from such models, much as they have 
revolutionized cognitive and perceptu- 
al psychology. 

Open research problem. Are there 
ways to formulate utility measures that 
are both sensitive to the impact of actions
on trust and long-term cooperation,
and efficient enough to allow robots
to make decisions in real time?


Making Robots Trustworthy 
Performance demands of social norms. 
Morality and ethics (and certain con- 
ventions) make up the social norms 
that encourage members of society to 
act in trustworthy ways. Applying these 
norms to the situations that arise in 
our complex physical and social envi- 
ronment imposes demanding perfor- 
mance requirements. 

Some moral and ethical decisions 
must be made quickly, for example 
while driving, leaving little time for 
deliberation. 

At the same time, the physical and 
social environment for these deci- 
sions is extremely complex, as is the 
agent’s current perception and past 
history of experience with that envi- 
ronment. Careful deliberation and 
discernment are required to identify 
the critical factors that determine the 
outcome of a particular decision. 
Metaphorically (Figure 2), we can 
think of moral and ethical decisions 
as defining sets in the extremely high- 
dimensional space of situations the 
agent might confront. Simple ab- 
stractions only weakly approximate 
the complexity of these sets. 

Across moral and non-moral do- 
mains, humans improve their exper- 
tise by learning from personal experi- 
ence, by learning from being told, and 
by observing the outcomes when oth- 
ers face similar decisions. Children 
start with little experience and a small 
number of simple rules they have 
been taught by parents and teachers. 
Over time, they accumulate a richer 
and more nuanced understanding of 
when particular actions are right or 
wrong. The complexity of the world 
suggests the only way to acquire ade- 
quately complex decision criteria is 
through learning. 

Robots, however, are manufactured 
artifacts, whose computational state can 
be stored, copied, and retrieved. Even if 
mature moral and ethical expertise can 
only be created through experience and 
observation, it is conceivable this exper- 
tise can then be copied from one robot 
to another sufficiently similar one, un- 
like what is possible for humans. 


Open research problem. What are 
the constraints on when expertise 
learned by one robot can simply be 
copied, to become part of the exper- 
tise of another robot? 

Hybrid decision architectures. Over 
the centuries, morality and ethics have 
been developed as ways to guide peo- 
ple to act in trustworthy ways. The 
three major philosophical theories of 
ethics—deontology, utilitarianism, 
and virtue ethics—provide insights 
into the design of a moral and ethical 
decision architecture for intelligent ro- 
bots. However, none of these theories 
is, by itself, able to meet all of the de- 
manding performance requirements 
listed previously. 

A hybrid architecture is needed, op- 
erating at multiple time-scales, draw- 
ing on aspects of all ethical theories: 
fast but fallible pattern-directed re- 
sponses; slower deliberative analysis of 
the results of fast decisions; and yet
slower individual and collective learn- 
ing processes. 

How can theories of philosophical 
ethics help us understand how to de- 
sign robots and other AIs to behave 
well in our society? 

Three major ethical theories. Conse- 
quentialism is the philosophical posi- 
tion that the rightness or wrongness of 
an action is defined in terms of its consequences.
Utilitarianism is a type of consequentialism
that, like decision theory
and game theory, holds that the right action
in a situation is the one that maximizes
a quantitative measure of utility.
Modern theories of decisions and
games contribute the rigorous use of
probabilities, discounting, and expected 
utilities for dealing with uncertainty in 
perception, belief, and action. 

Where decision theory tends to de- 
fine utility in terms of individual re- 
ward, utilitarianism aims to maxi- 
mize the overall welfare of everyone in 
society. While this avoids some of
the problems of selfish utility functions,
it raises new problems. For example,
caring for one’s family can have lower
utility than spending the same resources
to reduce the misery of distant strangers,
and morally repellent actions can be
justified by the greater good.

A concise expected-utility model 
supports efficient calculation. However,
it can be quite difficult to formulate
a concise model by determining
the best small set of relevant factors.
In the field of medical decision-making,
decision analysis models are
known to be useful, but are difficult 
and time-consuming to formulate. 
Setting up an individual decision 
model requires expertise to enumer- 
ate the possible outcomes, extensive 
literature search to estimate proba- 
bilities, and extensive patient inter- 
views to identify the appropriate utility 
measure and elicit the values of out- 
comes, all before an expected utility cal- 
culation can be performed. Even 
then, a meaningful decision requires 
extensive sensitivity analysis to deter- 
mine how the decision could be af- 
fected by uncertainty in the estimates. 
While this process is not feasible for 
making urgent decisions in real time, 
it may still be useful for post-hoc 
analysis of whether a quick decision 
was justified. 
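
As a rough illustration of why such models are cheap to evaluate but costly to set up, the sketch below computes expected utilities for a two-action decision and then sweeps one uncertain probability. The action names, the 0 to 100 utility scale, and every number are hypothetical; in practice they are exactly what the literature search and patient interviews would have to supply.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)


def best_action(model):
    """model: dict mapping an action name to its (probability, utility) list."""
    return max(model, key=lambda action: expected_utility(model[action]))


def treat_vs_wait(p_complication):
    # Hypothetical utilities; only the complication probability is varied.
    return {
        "treat": [(p_complication, 20.0), (1.0 - p_complication, 90.0)],
        "wait": [(1.0, 70.0)],
    }


# Sensitivity analysis: does the decision flip as the estimate moves?
sweep = {p: best_action(treat_vs_wait(p)) for p in (0.1, 0.2, 0.3, 0.4)}
```

Under these assumed numbers the preferred action switches from "treat" to "wait" between a complication probability of 0.2 and 0.3, so a decision made with an estimate near that range would not be robust.
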

Deontology is the study of duty (deon 
in Greek), which expresses morality and 
ethics in terms of obligations and prohi- 
bitions, often specified as rules and con- 
straints such as the Ten Command- 
ments or Isaac Asimov's Three Laws of 
Robotics. Deontological rules and con- 
straints offer the benefits of simplicity, 
clarity, and ease of explanation, but 
they raise questions of how they are justified
and where they come from. Rules
and constraints are standard tools for
knowledge representation and inference
in AI, and can be implemented
and used quite efficiently. 

However, in practice, rules and con- 
straints always have exceptions and 
unintended consequences. Indeed, 
most of Isaac Asimov's I, Robot stories
focus on unintended consequences and 
necessary extensions to his Three Laws. 

Virtue ethics holds that the individual
learns through experience and
practice to acquire virtues, much as 
an expert craftsman learns skills, and 
that virtues and skills are similarly 
grounded in appropriate knowledge 
about the world. Much of this
knowledge consists of concrete exam- 
ples that illustrate positive and nega- 
tive examples (cases) of virtuous be- 
havior. An agent who is motivated to 
be more virtuous tries to act more like 
cases of virtuous behavior (and less 
like the non-virtuous cases) that he 
has learned. Phronesis (or “practical 
wisdom”) describes an exemplary 
state of knowledge and skill that sup- 
ports appropriate responses to moral 
and ethical problems. 

A computational method suitable 
for virtue ethics is case-based reasoning,
which represents knowledge as
a collection of cases describing con- 
crete situations, the actions taken in 
those situations, and results of those 
actions. The current situation is 
matched against the stored cases, iden- 
tifying the most similar cases, adapting 
the actions according to the differenc- 
es, and evaluating the actions and out- 
comes. Both rule-based and case-based 
reasoning match the current situation 
(which may be very complex) against 
stored patterns (rules or cases). 

Virtue ethics and deontology differ 
in their approach to the complexity of 
ethical knowledge. Deontology as- 
sumes that a relatively simple abstrac- 
tion (defined by the terms appearing 
in the rules) applies to many specific 
cases, distinguishing between right 
and wrong. Virtue ethics recognizes
the complexity of the boundaries between
ethical judgments in the space
of possible scenarios (Figure 2), and
collects individual cases from the
agent’s experience to characterize
those boundaries.

Figure 2. Fractal boundaries.

Geometric fractal boundaries provide a metaphor for the complexity of the boundaries
between different ethical evaluations in the high-dimensional space of possible situations.
Simple boundaries can approximate the fractal set, but can never capture its shape exactly.

Understanding the whole elephant. 
Utilitarianism, deontology, and virtue 
ethics are often seen as competing, 
mutually exclusive theories of the na- 
ture of morality and ethics. I treat them 
here as three aspects of a more com- 
plex system for making ethical deci- 
sions (inspired by the children’s poem, 
The Blind Men and the Elephant). 

Rule-based and case-based reason- 
ing (AI methods expressing key as- 
pects of deontology and virtue ethics, 
respectively) can, in principle, respond 
in real time to the current situation. 
Those representations also hold prom- 
ise of supporting practical approaches 
to explanation of ethical decisions.
After a decision is made, when time for 
reflection is available, utilitarian rea- 
soning can be applied to analyze 
whether the decision was good or bad. 
This can then be used to augment the 
knowledge base with a new rule, con- 
straint, or case, adding to the agent’s 
ethical expertise (Figure 3). 

Previous work on robot ethics.
Formal and informal logic-based approaches
to robot ethics express a
“top-down” deontological approach
specifying moral and ethical knowledge.
While modal operators like obligatory
or forbidden are useful for ethical
reasoning, their problem is the difficulty
of specifying or learning critical
perceptual concepts (see Figure 2), for
example, non-combatant in Arkin’s approach
to the Laws of War.

Wallach and Allen survey issues and
previous work related to robot ethics, 
concluding that top-down approaches 
such as deontology and utilitarianism 
are either too simplistic to be adequate 
for human moral intuitions, or too com- 
putationally complex to be feasibly im- 
plemented in robots (or humans, for 
that matter). They describe virtue ethics 
as a hybrid of top-down and bottom-up 
methods, capable of naming and assert- 
ing the value of important virtues, while 
allowing the details of those virtues to 
be learned from relevant individual experience.
They hold that emotions,
case-based reasoning, and connectionist
learning play important roles in ethical
judgment. Abney also reviews ethical
theories in philosophy, concluding
that virtue ethics is a promising model 
for robot ethics. 

Scheutz and Arnold disagree, hold- 
ing that the need for a “computation- 
ally explicit trackable means of deci- 
sion making” requires that ethics be 
grounded in deontology and utilitari- 
anism. However, they do not adequate- 
ly consider the overwhelming complex- 
ity of the experienced world, and the 
need for learning and selecting concise 
abstractions of it. 

Recently, attention has been turned 
to human evaluation of robot behavior. 
Malle et al. asked human subjects to
evaluate reported decisions by humans
or robots facing trolley-type problems
(“Deadly Dilemmas”). The evaluators
blamed robots when they did not make
the utilitarian choice, and blamed humans
when they did. Robinette et al.
found that human subjects will “overtrust”
a robot in an emergency situation,
even in the face of evidence that
the robot is malfunctioning and that
its advice is bad.

Figure 3. Feedback and time scales in a hybrid ethical reasoning architecture.

Given a situation S(t), a fast case-based reasoning process retrieves similar cases, defines the
action A to take, and results in a new situation S’. At a slower time scale, the result is evaluated
and the new case is added to the case base. Feedback through explanation, justification, and
communication with others takes place at approximately this slower time scale. Abstraction
of similar cases to rules and learning of new concepts and relations are at a much slower time
scale, and social evolution is far slower still.

Representing ethical knowledge as 
cases. Consider a high-level sketch of a 
knowledge representation capable of 
expressing rich cases for case-based 
reasoning, but also highly abstracted 
“cases” that are essentially rules or con- 
straints for deontological reasoning. 

Let a situation S(t) be a rich descrip- 
tion of the current context. “Rich” 
means the information content of 
S(t) is very high, and also that it is avail- 
able in several hierarchical levels, not 
just the lowest “pixel level” description 
that specifies values for a large number 
of low-level elements (like pixels in an 
image). For example, a situation de- 
scription could include symbolic de- 
scriptions of the animate participants 
in a scenario, along with their individu- 
al characteristics and categories they 
might belong to, the relations holding 
among them, and the actions and 
events that have taken place. These sym- 
bolic descriptions might be derived 
from sub-symbolic input (for example, a 
visual image or video) by methods such 
as a deep neural network classifier. 

A case (S, A, S', v) is a description of a 
situation S, the action A taken in that 
situation, the resulting situation S’, 
and a moral evaluation v (or valence) of 
this scenario. A case representing on- 
going experience will be rich, reflecting 
the information-rich sensory input the 
agent receives, and the sophisticated 
processing that produces the hierar- 
chical description. A case representing 
the stored memory of events the agent 
has experienced will be significantly 
less rich. A “story” describing events 
can also be represented as a case, but it 
is less rich yet, consisting of a collec- 
tion of symbolic assertions. An even 
sparser and more schematic case is ef- 
fectively the same as a rule, matching 
certain assertions about a situation S, 
and proposing an action A, the result- 
ing situation S’, and perhaps the evalu- 
ation v of that scenario. 

The antecedent situation S in a case 
(S, A, S', v) need not describe a momentary
situation. It can describe a
scenario with temporal extent, including
intermediate actions and situations.

The ethical knowledge of an agent is 
a collection of cases. 
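
One way to picture this representation is as a small record type. The sketch below is only an illustration under strong simplifying assumptions: situations are reduced to sets of symbolic assertions written as strings, and the names `Case`, `matches`, and the sample assertions are hypothetical, not part of any published system.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

Assertion = str  # e.g. "fighting(P,Q)"; a stand-in for a rich description


@dataclass(frozen=True)
class Case:
    """A case (S, A, S', v): situation, action, resulting situation, valence."""
    before: FrozenSet[Assertion]
    action: str
    after: FrozenSet[Assertion]
    valence: Optional[str] = None  # "good", "bad", or None if not yet evaluated


def matches(case, situation):
    """A case applies when all of its antecedent assertions hold."""
    return case.before <= situation


# A "story"-level case: a handful of symbolic assertions.
story = Case(
    before=frozenset({"person(P)", "person(Q)", "brothers(P,Q)", "fighting(P,Q)"}),
    action="kill(P,Q)",
    after=frozenset({"dead(Q)"}),
    valence="bad",
)

# A much sparser case that behaves like a rule: it matches far more situations.
rule = Case(
    before=frozenset({"person(P)", "person(Q)"}),
    action="kill(P,Q)",
    after=frozenset({"dead(Q)"}),
    valence="bad",
)

knowledge_base = [story, rule]
```

The only difference between the story and the rule here is how much the antecedent says, which is the sense in which a sufficiently schematic case is effectively a rule.
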

Open research problem. This high- 
level sketch assumes that a morally 
significant action can be adequately 
described in terms of “before” and 
“after” situations, and that an evalua- 
tive valence can be computed, per- 
haps after the fact. Can a useful initial 
computational model of moral rea- 
soning be constructed on this basis, 
or will weaker assumptions be needed 
even to get started? 

Applying ethical case knowledge. 
Following the methods of case-based 
reasoning, the current situation S(t)
is matched against the case-base, to
retrieve the stored cases with antecedents
most similar to the current situation.
For example, suppose that the ethical
knowledge base includes two cases:
(S1, A1, S2, bad) and (S3, A2, S4, good),
and S(t) is similar to both S1 and S3. Then,
in the current situation S(t), the knowledge
base would recommend ¬do(A1, t)
and do(A2, t).

For example, suppose the current 
situation S(t) includes two people, P 
and Q, in conflict, the case antecedent 
S1 describes P and Q as fighting, and A1
describes P killing Q. In this case, in S2,
person Q is dead, which is bad.

As a rich representation of experience,
(S1, A1, S2, bad) would be highly
detailed and specific. As a story, say the 
Biblical story of Cain and Abel, it would 
be much less rich, but would still con- 
vey the moral prohibition against kill- 
ing. It could be abstracted even further, 
to essentially a deontological rule: 
Thou shalt not kill. The more abstracted 
the antecedent, the more likely the 
stored case is to match a given situa- 
tion, but the less likely this case is to 
distinguish adequately among cases 
with different moral labels. 

Situation S(t) also matches antecedent
S3, which describes P and Q as
arguing; A2 describes them reaching
an agreement, and S4 has them both
alive, which is good. Having retrieved
both cases, the right behavior is to try
to follow case (S3, A2, S4, good) and
avoid case (S1, A1, S2, bad), perhaps by
taking other actions to make S(t) more
similar to S3 and less similar to S1.
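
The retrieval step in this example can be sketched in a few lines. The code assumes that situations are sets of symbolic assertions and uses Jaccard overlap as the similarity measure; both choices, the threshold, and all assertion and action names are assumptions made for illustration, since the article does not commit to a particular measure.

```python
def jaccard(a, b):
    """Similarity of two sets of symbolic assertions (1.0 means identical)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def recommend(situation, case_base, threshold=0.3):
    """Return (do, avoid): actions of retrieved good cases and bad cases."""
    do, avoid = [], []
    for antecedent, action, _result, valence in case_base:
        if jaccard(situation, antecedent) >= threshold:
            (do if valence == "good" else avoid).append(action)
    return do, avoid


fighting = frozenset({"person(P)", "person(Q)", "conflict(P,Q)", "fighting(P,Q)"})
arguing = frozenset({"person(P)", "person(Q)", "conflict(P,Q)", "arguing(P,Q)"})
case_base = [
    # The fight that ends in a killing, evaluated as bad.
    (fighting, "kill(P,Q)", frozenset({"dead(Q)"}), "bad"),
    # The argument that ends in agreement, evaluated as good.
    (arguing, "negotiate(P,Q)", frozenset({"alive(P)", "alive(Q)"}), "good"),
]

current = frozenset({"person(P)", "person(Q)", "conflict(P,Q)"})
do, avoid = recommend(current, case_base)
```

Because the current situation overlaps both antecedents equally (three of four assertions), both cases are retrieved, and the knowledge base recommends the negotiating action while prohibiting the killing.
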

An essential part of case-based rea- 
soning is the ability to draw on several 
similar cases, adapting their actions
to create a new action that is more ap- 
propriate to S(t) than the actions from 
either of the stored cases. This adap- 
tation can be used to interpolate be- 
tween known cases with the same va- 
lence, or to identify more precisely 
the boundary between cases of oppo- 
site valence. 

Responsiveness, deliberation, and 
feedback. Some ethical decisions must 
be made quickly, treating case ante- 
cedents as patterns to be matched to 
the current situation S(t). Some cases 
are rich and highly specific to particu- 
lar situations, while others are sparse, 
general rules that can be used to con- 
strain the set of possible actions. 

Once an action has been selected and 
performed, there may be time for delib- 
eration on the outcome, to refine the 
case evaluation and benefit from feed- 
back. Simply adding a case describing 
the new experience to the knowledge 
base improves the agent’s ability to pre- 
dict the results of actions and decide 
more accurately what to do in future sit- 
uations. Thus, consequentialist (includ- 
ing utilitarian) analysis becomes a slow- 
er feedback loop, too slow to determine 
the immediate response to an urgent 
situation, but able to exploit informa- 
tion in the outcome of the selected ac- 
tion to improve the agent’s future deci- 
sions in similar situations (Figure 3). 
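
The two time scales can be sketched as two methods on one agent: a fast path that only matches patterns, and a slow path that evaluates outcomes afterward. Everything here is a hypothetical simplification: the class name `HybridAgent`, the overlap-count similarity, and the `evaluate` callback standing in for consequentialist analysis.

```python
class HybridAgent:
    def __init__(self, case_base):
        # Each case is (situation, action, result, valence).
        self.case_base = list(case_base)
        self.pending = []  # experiences not yet evaluated

    def act(self, situation, default="do_nothing"):
        """Fast path: reuse the action of the most similar good case."""
        good = [c for c in self.case_base if c[3] == "good"]
        if not good:
            return default
        best = max(good, key=lambda c: len(c[0] & situation))
        return best[1]

    def record(self, situation, action, result):
        """Store what happened; evaluation is deferred."""
        self.pending.append((situation, action, result))

    def deliberate(self, evaluate):
        """Slow path: judge each outcome and add it to the case base."""
        for situation, action, result in self.pending:
            self.case_base.append((situation, action, result, evaluate(result)))
        self.pending.clear()


agent = HybridAgent(
    [(frozenset({"ball_in_street"}), "slow_down", frozenset({"no_collision"}), "good")]
)
seen = frozenset({"ball_in_street", "parked_cars"})
chosen = agent.act(seen)                       # immediate response
agent.record(seen, chosen, frozenset({"no_collision"}))
agent.deliberate(lambda result: "bad" if "collision" in result else "good")
```

The design choice worth noting is that `act` never calls the evaluator; the slower analysis only changes what future calls to `act` will retrieve.
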

Open research problem. How do rea- 
soning processes at different time-scales 
allow us to combine apparently incom- 
patible mechanisms to achieve appar- 
ently incompatible goals? What concrete 
multi-time-scale architectures are useful 
for moral and ethical judgment, and im- 
provement through learning? 

Explanation. In addition to making 
decisions and carrying out actions, an 
ethical agent must be able to explain 
and justify the decision and action,
providing several distinct types of feed- 
back to improve the state of the ethical 
knowledge base. 

Suppose agent P faces a situation, 
makes a decision, carries it out, and ex- 
plains his actions to agent Q. If P is an 
exemplary member of the society and 
makes a good decision, Q can learn 
from P’s actions and gain in expertise. 
If P makes a poor decision, simply be- 
ing asked to explain himself gives P an 
opportunity to learn from his own mis- 
take, but Q may also give P instructions 
and insights that will help P make better
decisions in the future. Even if P has
made a poor decision and refuses to 
learn from the experience, Q can still 
learn from P’s bad example. 

Explanation is primarily a mecha- 
nism whereby individuals come to 
share the society’s consensus beliefs 
about morality and ethics. However, 
the influence is not only from the so- 
ciety to individuals. Explanations and 
insights can be communicated from 
one person to another, leading to evo- 
lutionary social change. As more and 
more individuals share a new view of 
morality and ethics, the society as a 
whole approaches a tipping point, af- 
ter which society’s consensus posi- 
tion can change with startling speed. 

Learning ethical case knowledge. A 
child learns ethical knowledge in the 
form of simple cases provided by par- 
ents and other adults: rules, stories, 
and labels for experienced situations. 
These cases express social norms for 
the child. 

An adult experiences a situation 
S(t), retrieves a set of similar cases, 
adapts the actions from those cases to 
an action A for this situation, performs 
that action, observes the result S’, and 
assigns a moral valence v. A new case 
(S, A, S’, v) is constructed and added to 
the case base (Figure 3). With increas- 
ing experience, more cases will match 
a given S(t), and the case-base will 
make finer-grained distinctions among 
potential behaviors. The metaphor of 
the fractal boundary between good and 
bad ethical judgments in knowledge 
space (Figure 2) implies that a good
approximation to this boundary re- 
quires both a large number of cases 
(quantity) and correct placement and 
labeling of those cases (quality). 

Once the case base accumulates 
clusters of cases with similar but not 
identical antecedents, then some of 
those clusters can be abstracted to 
much sparser cases (that is, rules) that
make certain actions forbidden or 
obligatory in certain situations. The 
cluster of cases functions as a labeled 
training set for a classification prob- 
lem to predict the result and evaluation 
of an action in antecedent situations in 
that cluster. This can determine which 
attributes of the antecedent cases are 
essential to a desired result and evalua- 
tion, and which are not. 
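
A very crude stand-in for this abstraction step is to keep only the assertions shared by every antecedent in a cluster. This is a sketch under the assumption that situations are sets of symbolic assertions; a real system would pose it as a classification problem, as described above, and the function name and sample data are hypothetical.

```python
def abstract_rule(cluster):
    """Abstract a cluster of (situation, action, valence) cases into one rule.

    All cases must share the same action and valence. The resulting rule
    keeps only the assertions common to every antecedent in the cluster.
    """
    actions = {action for _s, action, _v in cluster}
    valences = {valence for _s, _a, valence in cluster}
    if len(actions) != 1 or len(valences) != 1:
        raise ValueError("cluster must share one action and one valence")
    common = frozenset.intersection(*(s for s, _a, _v in cluster))
    return common, actions.pop(), valences.pop()


cluster = [
    (frozenset({"conflict(P,Q)", "fighting(P,Q)", "night"}), "kill(P,Q)", "bad"),
    (frozenset({"conflict(P,Q)", "arguing(P,Q)", "day"}), "kill(P,Q)", "bad"),
]
rule = abstract_rule(cluster)
```

In this toy cluster the time of day and the form of the conflict drop out as inessential, leaving a sparse antecedent that forbids the action whenever the conflict itself is present.
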




Open research problem. Is it neces- 
sary to distinguish between ethical and 
non-ethical case knowledge, or is this 
approach appropriate for both kinds of 
skill learning? 

Open research problem. Sometimes, a 
correct ethical judgment depends on 
learning a new concept or category, 
such as non-combatant or self-defense.
Progress in deep neural network learn- 
ing methods may be due to autonomous 
learning of useful intermediate con- 
cepts. However, it remains difficult to 
make these intermediate concepts ex- 
plicit and available for purposes such 
as explanation or extension to new 
problems. Furthermore, these meth- 
ods depend on the availability and 
quality of large labeled training sets. 

Open research problem. What mech- 
anisms are available for expressing ap- 
propriate abstractions from rich expe- 
rience to the features that enable 
tractable discrimination between mor- 
al categories? In addition to deep neu- 
ral network learning, other examples 
include similarity measures among 
cases for case-based reasoning and 
kernels for support vector machines. 
How can these abstractions be learned 
from experience? 


The Deadly Dilemma 

The self-driving car is an intelligent 
robot whose autonomous decisions 
have potential to cause great harm to 
individual humans. People often ask 
about a problem I call the Deadly Di- 
lemma: How should a self-driving car 
respond when faced with a choice be- 
tween hitting a pedestrian (possibly a 
small child who has darted into the 
street), versus crashing and harming 
its passengers?

Either choice, of course, leads to a 
serious problem with the trustworthi- 
ness of the robot car. If the robot would 
choose to kill the pedestrian to save it- 
self and its passengers, then why 
should the public trust such robots 
enough to let them drive on public 
roads? If the robot could choose to 
harm its passengers, then why would 
anyone trust such a robot car enough 
to buy one? 

The self-driving car could be a bell- 
wether for how autonomous robots 
will relate to the social norms that sup- 
port society. However, while the Deadly 
Dilemma receives a lot of attention, the 
stark dilemma distracts from the important
problems of designing a trustworthy
self-driving car.

Learning to avoid the dilemma. As 
stated, the Deadly Dilemma is difficult 
because it presents exactly two options, 
both bad (hence, the dilemma). The 
Deadly Dilemma is also extremely 
rare. Far more often than an actual 
Deadly Dilemma, an agent will expe- 
rience Near Miss scenarios, where 
the dire outcomes of the Dilemma 
can be avoided, often by identifying 
“third way” solutions other than the 
two bad outcomes presented by the 
Dilemma. These experiences can 
serve as training examples, helping 
the agent learn to apply its ethical 
knowledge on solvable problems, ac- 
quiring “practical wisdom” about 
avoiding the Deadly Dilemma. 

Sometimes, when reflecting on a 
Near Miss after the fact, the agent can 
identify an “upstream” decision point 
where a different choice would have 
avoided the Dilemma entirely. For ex- 
ample, it can learn to notice when a 
small deviation from the intended 
plan could be catastrophic, or when a 
pedestrian could be nearby but hid- 
den. A ball bouncing into the street 
from between parked cars poses no 
threat to a passing vehicle, but a good 
driver slows or stops immediately, be- 
cause a small child could be chasing it. 
Implementing case-based strategies 
like these for a self-driving car may re- 
quire advances in both perception and 
knowledge representation, but these 
advances are entirely feasible. 

Earning trust. An agent earns trust 
by showing that its behavior consis- 
tently accords with the norms of soci- 
ety. The hybrid architecture described 
here sketches a way that an agent can 
learn about those social norms from its 
experience, responding quickly to situ- 
ations as they arise, but then more 
slowly learning by reflecting on its suc- 
cesses and failures, and identifying 
useful abstractions and more efficient 
rules based on that experience. 

In ordinary driving, the self-driving 
car earns trust by demonstrating that 
it obeys social norms, starting with 
traffic laws, but continuing with cour- 
teous behavior, signaling its inten- 
tions to pedestrians and other driv- 
ers, taking turns, and deferring to 
others when appropriate. In crisis sit- 


uations, it demonstrates its ability to 
use its situational awareness and fast 
reaction time to find “third ways” out 
of Near Miss scenarios. Based on 
post-hoc crisis analyses, whether the 
outcome was success or failure, it may 
be able to learn to identify upstream 
decision points that will allow it to 
avoid such crises in the first place. 

Technological advances, particu- 
larly in the car’s ability to predict the 
intentions and behavior of other 
agents, and in the ability to anticipate 
potential decision points and places 
that could conceal a pedestrian, will 
certainly be important to reaching this 
level of behavior. We can be reasonably 
optimistic about this kind of cognitive 
and perceptual progress in machine 
learning and artificial intelligence. 

Since 94% of auto crashes are associated
with driver error, there will be
plentiful opportunities to demonstrate 
trustworthiness in ordinary driving 
and solvable Near Miss crises. Both so- 
ciety and the purchasers of self-driving 
cars will gain substantially greater per- 
sonal and collective safety in return for 
slightly more conservative driving. 

For self-driving cars sharing the 
same ethical knowledge base, the be- 
havior of one car provides evidence 
about the trustworthiness of all others, 
leading to rapid convergence. 


Conclusion 
Trust is essential for the successful 
functioning of society. Trust is neces- 
sary for cooperation, which produces 
the resources society needs. Morality, 
ethics, and other social norms encour- 
age individuals to act in trustworthy 
ways, avoiding selfish decisions that 
exploit vulnerability, violate trust, and 
discourage cooperation. As we contem- 
plate the design of robots (and other 
AIs) that perceive the world and select
actions to pursue their goals in that 
world, we must design them to follow 
the social norms of our society. Doing 
this does not require them to be true 
moral agents, capable of genuinely tak- 
ing responsibility for their actions. 

Social norms vary by society, so ro- 
bot behavior will vary by society as 
well, but this is outside the scope of 
this article. 

The major theories of philosophical
ethics provide clues toward the design
of such AI agents, but a successful
design must combine aspects of all
theories. The physical and social envi- 
ronment is immensely complex. Even 
so, some moral decisions must be 
made quickly. But there must also be a 
slower deliberative evaluation proc- 
ess, to confirm or revise the rapidly re- 
sponding rules and constraints. At 
longer time scales, there must be 
mechanisms for learning new con- 
cepts for virtues and vices, mediating 
between perceptions, goals, plans, 
and actions. The technical research 
challenges are how to accomplish all 
these goals. 

Self-driving cars may well be the 
first widespread examples of trustwor- 
thy robots, designed to earn trust by 
demonstrating how well they follow so- 
cial norms. The design focus for self- 
driving cars should not be on the Dead- 
ly Dilemma, but on how a robot’s 
everyday behavior can demonstrate its 
trustworthiness. 
Technical Perspective
A Graph-Theoretic Framework 
Traces Task Planning 


By Nicole Immorlica 


ALGORITHMIC GAME THEORY has made 
great strides in recent decades by as- 
suming standard economic models of 
rational agent behavior to study out- 
comes in distributed computational 
settings. From the analysis of Internet 
routing to the design of advertisement 
auctions and crowdsourcing tasks, re- 
searchers leveraged these models to 
characterize the performance of the 
underlying systems and guide practi- 
tioners in their optimization. These 
models have tractable mathematical 
formulations and broadly applicable 
conclusions that drive their success, 
but they rely strongly on the assump- 
tion of rationality. 

The assumption of rationality is 
at times questionable, particularly in 
systems in which human actors make 
most of the decisions and in systems 
that evolve over time. Humans, sim- 
ply put, are bad at thinking about the 
impact of their actions on their en- 
vironment and their future. We see 
this every day in the way we manage 
our time. Students cram for exams 
despite planning not to, and even 
though it is well documented that 
well-spaced studying produces im- 
proved learning results with equal ef- 
fort. Humans in lab experiments also 
consistently exhibit similar irrational 
time-inconsistent planning and pro- 
crastination behavior. 

In the 1990s, economists proposed 
an alternate model of agent behavior 
called quasi-hyperbolic discounting, 
which incorporates time-inconsisten- 
cies and procrastination. In this mod- 
el, agents overinflate the cost of cur- 
rent actions with respect to future days’ 
actions. Thus, an hour of studying to- 
day might be as painful as two hours of 
studying on any future day. Note this 
causes agents to have a time-inconsistent
view of the future: today, the student
thinks that tomorrow’s costs are equal
to those of the day after, but when
tomorrow comes, the student will
re-evaluate and decide that in fact
tomorrow’s costs are greater than those of 
the day after. This time-inconsistency 
can cause agents to procrastinate in- 
definitely and abandon valuable goals. 
With the growth of personalized 
computing, it is especially important 
for researchers to design and analyze 
systems in the presence of irrational 
human behavior, such as that de- 
scribed by quasi-hyperbolic discount- 
ing. Our cellphones guide us through 
our lives, helping us, for example, to 
manage our time, optimize our diets, 
and achieve our fitness goals. To do 
so effectively, these apps need a pre- 
dictive and mathematically tractable 
model of our behavior. The quasi-hy- 
perbolic discounting model is prom- 
ising: economists have shown in both 
lab and field experiments that it is 
highly predictive. However, the prior 
literature fails to provide a suitable 
framework in which to reason about 
quasi-hyperbolic discounting in the 
presence of complex planning tasks. 
In the following paper, Kleinberg 
and Oren describe a graph-theoretic 
framework for task planning with 
quasi-hyperbolic discounting. In 
their framework, the goal and the in- 
termediate tasks are nodes in a graph, 
and the weights of the (directed) 
edges represent the costs of advancing
from one task to the next. The 
intended present and future actions 
of a quasi-hyperbolic discounter are 
simply weighted shortest-paths in 
this graph. 

This formulation allows research- 
ers to use the extensive existing 
knowledge of graph algorithms to 
analyze and optimize task planning. 
In the paper, Kleinberg and Oren use 
their framework to derive many re- 
sults. Among these, they characterize 
tasks that are particularly susceptible 
to procrastination: they are those 
that contain a simple structure as a 
graph minor. They also explore ways 
to reduce procrastination by choice 
reduction: by scheduling a midterm 
exam, an instructor can remove the 
choice of cramming for the final in 
the last week of class, resulting in 
better study habits. 

The framework of Kleinberg and 
Oren, and the characterization and 
optimization results it enables, is a 
step forward in incorporating sophis- 
ticated models of human behavior 
into computational systems. Apps 
designed to increase individual ef- 
fectiveness and related products can 
use this framework to help us achieve 
our personal goals. They can predict 
when we are in danger of procras- 
tinating, and perhaps, by cleverly 
hiding the availability of certain ac- 
tions, they can even help us mitigate 
the extent of procrastination.
Subsequent work has used, and future
work will continue to use, the framework
to develop many more such useful
results. In our complex modern world 
with its growing plethora of choices, 
automated planning of the sort aided 
by this paper and the research it in- 
spires is indispensable. 
Time-Inconsistent Planning: 
A Computational Problem 
in Behavioral Economics 


By Jon Kleinberg and Sigal Oren 


Abstract 
In many settings, people exhibit behavior that is inconsis- 
tent across time—we allocate a block of time to get work 
done and then procrastinate, or put effort into a project and 
then later fail to complete it. An active line of research in 
behavioral economics and related fields has developed and 
analyzed models for this type of time-inconsistent behavior. 
Here we propose a graph-theoretic model of tasks and 
goals, in which dependencies among actions are repre- 
sented by a directed graph, and a time-inconsistent agent 
constructs a path through this graph. We first show how 
instances of this path-finding problem on different input 
graphs can reconstruct a wide range of qualitative phenom- 
ena observed in the literature on time-inconsistency, includ- 
ing procrastination, abandonment of long-range tasks, and 
the benefits of reduced sets of choices. We then explore a set 
of analyses that quantify over the set of all graphs; among 
other results, we find that in any graph, there can be only 
polynomially many distinct forms of time-inconsistent 
behavior; and any graph in which a time-inconsistent agent 
incurs significantly more cost than an optimal agent must 
contain a large “procrastination” structure as a minor. 
Finally, we use this graph-theoretic model to explore ways 
in which tasks can be designed to motivate agents to reach 
designated goals. 


1. INTRODUCTION 
A fundamental issue in behavioral economics—and in the 
modeling of individual decision-making more generally—is 
to understand the effects of decisions that are inconsistent 
over time. Examples of such inconsistency are widespread in 
everyday life: we make plans for completing a task but then 
procrastinate; we put work into getting a project partially 
done but then abandon it; we pay for gym memberships but 
then fail to make use of them. In addition to analyzing and 
modeling these effects, there has been increasing interest 
in incorporating them into the design of policies and incen- 
tives in domains that range from health to personal finance. 
These types of situations have a recurring structure: a 
person makes a plan at a given point in time for something 
they will do in the future (finishing homework, exercising, 
paying off a loan), but at a later point in time they fail to fol- 
low through on the plan. Sometimes this failure is the result 
of unforeseen circumstances that render the plan invalid— 
a person might join a gym but then break their leg and be 
unable to exercise—but in many cases the plan is abandoned 
even though the circumstances are essentially the same as 
they were at the moment the plan was made. This presents 
a challenge to any model of planning based on optimizing a 
utility function that is consistent over time: in an optimiza- 
tion framework, the plan must have been an optimal choice 
at the outset, but later it was optimal to abandon it. A line 
of work in the economics literature has thus investigated 
the properties of planning with objective functions that vary 
over time in certain natural and structured ways. 


A basic example and model 

To introduce these models, it is useful to briefly describe 
an example due to George Akerlof,' with the technical 
details adapted slightly for the discussion here. (The story 
will be familiar to readers who know Akerlof’s paper; we 
cover it in some detail because it will motivate a useful and 
recurring construction later in the work.) Imagine a deci- 
sion-making agent—Akerlof himself, in his story—who 
needs to ship a package sometime during one of the next 
n days, labeled t = 1, 2, ..., n, and must decide on which 
day t to do so. Each day that the package has not been sent 
results in a fixed cost of 1 (per day), due to the lack of use of 
the package’s contents; this means a cost of t if it is shipped on 
day t. (For simplicity, we will disregard the time the pack- 
age spends in transit, which is a constant additional cost 
regardless of when it is shipped.) Also, shipping the package 
is an elaborate operation that will result in a one-time 
cost of c > 1, due to the amount of time involved in getting 
it sent out. The package must be shipped during one of the 
n specified days. 

What is the optimal plan for shipping the package? Clearly 
the cost c will be incurred exactly once regardless of the day 
on which it is shipped, and there will also be a cost of t if it 
is shipped on day t. Thus we are minimizing c + t subject to 
1 ≤ t ≤ n; the cost is clearly minimized by setting t = 1. In other 
words, the agent should ship the package right away. 

But in Akerlof’s story, he did something that should be 
familiar from one’s own everyday experience: he procrasti- 
nated. Although there were no unexpected changes to the 
trade-offs involved in shipping the package, when each 
new day arrived there seemed to be other things that were 
more crucial than sending it out that day, and so each day he 
resolved that he would instead send it tomorrow. The result 
was that the package was not sent out until the end of the 
time period. (In fact, he sent it a few months into the time 
period once something unexpected happened to change the 
cost structure—a friend offered to send it as part of a larger 
shipment—though this wrinkle is not crucial for the story.) 

The original version of this paper was published in the 
Proceedings of the 15th ACM Conference on Economics and 
Computation (Palo Alto, CA, June 8-12, 2014), 547-564. 

There is a natural way to model an agent’s decision to 
procrastinate, using the notion of present bias—the ten- 
dency to view costs and benefits that are incurred at the pres- 
ent moment to be more salient than those incurred in the 
future. In particular, suppose that for a constant b > 1, costs 
that one must incur in the current time period are increased 
by a factor of b in one’s evaluation.ᵃ Then in Akerlof’s example, 
when the agent on day t is considering the decision to 
send the package, the cost of sending it on day t is bc + t, while 
the cost of sending it on day t + 1 is c + t + 1. The difference 
between these two costs is (b - 1)c - 1, and so if (b - 1)c > 1, the 
agent will decide on each day t that the optimal plan is to wait 
until day t + 1; things will continue this way until day n, when 
waiting is no longer an option and the package must be sent. 
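
The waiting rule above can be replayed numerically. The following is a minimal Python sketch, not code from the paper: the names `shipping_day` and `true_cost` are hypothetical, and breaking ties in favor of shipping today is an assumption the text does not specify.

```python
def shipping_day(n: int, c: float, b: float) -> int:
    """Day on which a naive present-biased agent ships the package."""
    for t in range(1, n):
        cost_today = b * c + t        # shipping effort inflated by b, plus t days of delay
        cost_tomorrow = c + (t + 1)   # tomorrow's effort is not inflated in today's eyes
        if cost_today <= cost_tomorrow:   # assumption: ties are resolved by shipping now
            return t
    return n                          # on day n waiting is no longer an option


def true_cost(t: int, c: float) -> float:
    """Actual cost of shipping on day t, without any present bias."""
    return c + t


if __name__ == "__main__":
    n, c = 10, 3.0
    for b in (1.2, 2.0):
        day = shipping_day(n, c, b)
        print(f"b={b}: ships on day {day}, true cost {true_cost(day, c)}")
```

With c = 3, the value b = 1.2 gives (b - 1)c = 0.6, so the agent ships immediately, while b = 2 gives (b - 1)c = 3 > 1 and the agent postpones until the last day.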


Quasi-hyperbolic discounting 

Building on considerations such as those above, and oth- 
ers in earlier work in economics,'® ® a significant amount of 
work has developed around a model of time-inconsistency 
known as quasi-hyperbolic discounting.” In this model, 
parametrized by quantities β, δ ≤ 1, a cost or reward of value 
c that will be realized at a point t ≥ 1 time units into the 
future is evaluated as having a present value of βδ^t c. (In other 
words, values at time t are discounted by a factor of βδ^t.) With 
β = 1 this is the standard functional form for exponential 
discounting, but when β < 1 the function captures present bias 
as well: values in the present time period are scaled up by β^{-1} 
relative to all other periods. (In what follows, we will 
consistently use b to denote β^{-1}.) 
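
As an illustration of the discounting formula, the sketch below evaluates the present value of a future cost and shows the preference reversal that present bias produces. The function name `present_value` and the example numbers are assumptions made for this illustration; they do not come from the paper.

```python
def present_value(c: float, t: int, beta: float, delta: float) -> float:
    """Perceived value today of a cost or reward c realized t periods from now."""
    if t == 0:
        return c                    # the current period is not discounted
    return beta * delta ** t * c    # every future period carries the extra factor beta


if __name__ == "__main__":
    beta, delta = 0.5, 1.0
    # Seen from today: effort 3 tomorrow versus effort 4 the day after.
    plan_early = present_value(3, 1, beta, delta) < present_value(4, 2, beta, delta)
    # Seen from tomorrow: the same two options, now one period closer.
    still_early = present_value(3, 0, beta, delta) < present_value(4, 1, beta, delta)
    print(plan_early, still_early)
```

Today the agent plans to take the cheaper, earlier option, but once that option falls in the present period the later one looks better, which is the time-inconsistency the model is meant to capture.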

Research on this (β, δ)-model of discounting has been 
extensive, and has proceeded in a wide variety of direc- 
tions; see Ref. Frederick et al.’ for a review. To keep our 
analysis clearly delineated in scope, we make certain deci- 
sions at the outset relative to the full range of possible 
research questions: we focus on a model of agents who 
are naive, in that they do not take their own time-inconsis- 
tency into account when planning; we do not attempt to 
derive the (β, δ)-model from more primitive assumptions 
but rather take it as a self-contained description of the 
agent’s observed behavior; and we discuss the case of δ = 1 
so as to focus attention on the present-bias parameter β. 
Note that the initial Akerlof example has all these properties; 
it is essentially described in terms of the (β, δ)-model 
with an agent who is naive about his own time-inconsistency, 
with δ = 1, and with β = b^{-1} (using the parameter b 
from that discussion). 

Our starting point in this paper is to think about some of 
the qualitative predictions of the (β, δ)-model, and how to 
analyze them in a unified framework. In particular, research 
in behavioral economics has shown how agents making 
plans in this model can exhibit the following behaviors. 


a Note that there is no time-discounting in this example, so the factor of b 
is only applied to the present time period, while all future time periods are 
treated equally. We will return to the issue of discounting shortly. 
1. Procrastination, as discussed above. 

2. Abandonment of long-range tasks, in which a person 
starts on a multi-stage project but abandons it in the 
middle, even though the costs and benefits of the project 
have remained essentially unchanged.ᵇ 

3. The benefits of choice reduction, in which reducing the 
set of options available to an agent can actually help 
them reach a goal more efficiently.” * A canonical 
example is the way in which imposing a deadline can 
help people complete a task that might not get fin- 
ished in the absence of a deadline.* 


These consequences of time-inconsistency, as well as a 
number of others, have in general each required their own 
separate and sometimes quite intricate modeling efforts. 
It is natural to ask whether there might instead be a single 
framework for representing tasks and goals in which all of 
these effects could instead emerge “mechanically,” each 
just as a different instance of the same generic computa- 
tional problem. With such a framework, it would become 
possible to search for worst-case guarantees, by quantifying 
over all instances, and to talk about designing or modifying 
given task structures to induce certain desired behaviors. 


The present work: A graph-theoretic model 

Here we propose such a framework, using a graph-theoretic 
formulation. We consider an agent with present-bias parameter 
β who must construct a path in a directed acyclic graph 
G with edge costs, from a designated start node s to a 
designated target node t. We assume without loss of generality 
that all the edges of G lie on some s-t path. We will call 
such a structure a task graph. Informally, the nodes of the 
task graph represent states of intermediate progress toward 
the goal t, and the edges represent transitions between 
them. Directed graphs have been shown to have consider- 
able expressive power for planning problems in the artifi- 
cial intelligence literature;'* this provides evidence for the 
robustness of a graph-based approach in representing these 
types of decision environments. Our concerns in this work, 
however, are quite distinct from the set of graph-based plan- 
ning problems in artificial intelligence, since our aim is 
to study the consequences of time-inconsistency in these 
domains. 

A sample instance of this problem is depicted in Figure 1, 
with the costs drawn on the edges. When the agent is standing 
at a node v, it determines the minimum cost of a path 
from v to t, but it does so using its present-biased evaluation 
of costs: the cost of the first edge on the path (starting from v) 
is evaluated according to the true cost, and all subsequent 
edges have their costs scaled down by a factor of β. If the agent chooses 
path P, it follows just the first edge (v, w) of P, and then it 
re-evaluates which path to follow using this same present- 
biased evaluation but now from w. In this way, the agent 
iteratively constructs a path from s to t. 


b For purposes of our discussion, we distinguish abandonment of a task 
from the type of procrastination exhibited by Akerlof’s example, in which 
the task is eventually finished, but at a much higher cost due to the effect of 
procrastination. 


Figure 1. A present-biased agent must choose a path from s to t. 


In the next section we will show how our graph-theo- 
retic model easily captures time-inconsistency phenom- 
ena including procrastination, abandonment, and choice 
reduction. But to make the definitions concrete, it is use- 
ful to work through the agent’s computation on the graph 
depicted in Figure 1. As shown in Figure 1, an agent that has 
a present-bias parameter of β = 1/2 needs to go from s to t. 
From s, the agent evaluates the path s-a-b-t as having cost 
16 + 2β + 2β = 18, the path s-c-d-t as having cost 8 + 8β + 8β = 16, 
and the path s-c-e-t as having cost 8 + 2β + 16β = 17. Thus 
the agent traverses the edge (s, c) and ends up at c. From c, 
the agent now evaluates the path c-d-t as having cost 8 + 8β 
= 12 and the path c-e-t as having cost 2 + 16β = 10, and so 
the agent traverses the edge (c, e) and then (having no further 
choices) continues on the edge (e, t). 

This example illustrates a few points. First, when the 
agent set out on the edge (s, c), it was intending to next fol- 
low the edge (c, d), but when it got to c, it changed its mind 
and followed the edge (c, e). A time-consistent agent (with 
β = 1), in contrast, would never do this; the path it decides 
to take starting at s is the path it will continue to follow all 
the way to t. Second, we are interested in whether the agent 
minimizes the cost of traveling from s to t according to the 
real costs, not according to its evaluation of the costs, and in 
this regard it fails to do so; the shortest path is s-a-b-t, with a 
cost of 20, while the agent incurs a cost of 26. 
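
The walk on Figure 1 can be reproduced with a short program. This is a minimal sketch rather than the authors' code: the edge costs in `FIG1` are inferred from the path evaluations quoted in the text (for example, s-a-b-t has true cost 16 + 2 + 2 = 20), and the helper names `shortest_costs` and `biased_walk` are hypothetical. It relies on one observation: minimizing "first edge at full cost plus β times the rest" over all paths from v is the same as choosing the edge (v, w) that minimizes c(v, w) + β·d(w, t), where d is the true shortest-path cost.

```python
from functools import lru_cache

# Edge costs of Figure 1, inferred from the evaluations given in the text.
FIG1 = {
    "s": {"a": 16, "c": 8},
    "a": {"b": 2},
    "b": {"t": 2},
    "c": {"d": 8, "e": 2},
    "d": {"t": 8},
    "e": {"t": 16},
    "t": {},
}


def shortest_costs(graph, target):
    """True shortest-path cost d(v, target) for every node of a DAG."""
    @lru_cache(maxsize=None)
    def d(v):
        if v == target:
            return 0.0
        return min((c + d(w) for w, c in graph[v].items()), default=float("inf"))
    return {v: d(v) for v in graph}


def biased_walk(graph, source, target, beta):
    """Path followed by a naive present-biased agent, with its true total cost.

    Assumes every node other than the target has an outgoing edge on some
    path to the target, as in the task graphs defined in the text.
    """
    d = shortest_costs(graph, target)
    path, total, v = [source], 0.0, source
    while v != target:
        # First edge at full cost; best continuation scaled by beta.
        w = min(graph[v], key=lambda u: graph[v][u] + beta * d[u])
        total += graph[v][w]
        path.append(w)
        v = w
    return path, total


if __name__ == "__main__":
    print(biased_walk(FIG1, "s", "t", 0.5))   # present-biased agent
    print(biased_walk(FIG1, "s", "t", 1.0))   # time-consistent agent
```

Setting `beta` to 1 recovers an ordinary shortest-path walk, which makes the comparison between the biased and the time-consistent agent a one-line change.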


Overview of results 

Our graph-theoretic framework makes it possible to reason 
about time-inconsistency effects that arise in very differ- 
ent settings, provided simply that the underlying decisions 
faced by the agent can be modeled as the search for a path 
through a graph-structured sequence of options. And per- 
haps more importantly, since it is tractable to ask questions 
that quantify over all possible graphs, we can cleanly com- 
pare different scenarios, and search for the best or worst 
possible structures relative to specific objectives. This is dif- 
ficult to do without an underlying combinatorial structure. 
For example, suppose we were inspired by Akerlof’s example 
to try identifying the scenario in which time-inconsistency 
leads to the greatest waste of effort. A priori, it is not clear 
how to formalize the search over all possible “scenarios.” 
But as we will see, this is precisely something we can do if 
we simply ask for the graph in which time-inconsistency 
produces the greatest ratio between the cost of the path tra- 
versed and cost of the optimal path. 


Moreover, with this framework in place, it becomes easier 
to express formal questions about design for these contexts: 
if, as a designer of a complex task, we are able to specify 
the underlying graph structure, which graphs will lead time- 
inconsistent agents to reach the goal efficiently? 

Our core questions are based on quantifying the inef- 
ficiency from time-inconsistent behavior, designing task 
structures to reduce this inefficiency, and comparing the 
behavior of agents with different levels of time-inconsis- 
tency. Specifically, we ask: 


1. In which graph structures does time-inconsistent 
planning have the potential to cause the greatest waste 
of effort relative to optimal planning? 

2. How do agents with different levels of present bias 
(encoded as different values of β) follow combinatorially 
different paths through a graph toward the same 
goal? 

3. Can we increase an agent’s efficiency in reaching a 
goal by deleting nodes and/or edges from the underly- 
ing graph, thus reducing the number of options 
available? 


In what follows, we address these questions in turn. 
For the first question, we consider n-node graphs and ask 
how large the cost ratio can be between the path followed 
by a present-biased agent and the path of minimum 
total cost. Since deviations from the minimum-cost plan 
due to present bias are sometimes viewed as a form of 
“irrational” behavior, this cost ratio effectively serves as 
a “price of irrationality” for our system. We give a char- 
acterization of the worst-case graphs in terms of graph 
minors; this enables us to show, roughly speaking, that 
any instance with sufficiently high cost ratio must con- 
tain a large instance of the Akerlof example embedded 
inside it. 

For the second question, we consider the possible paths 
followed by agents with different present-bias parameters β. 
As we sweep β over the interval [0, 1], we have a type of 
parametric path problem, where the choice of path is governed 
by a continuous parameter (β in this case). We show that in 
any instance, the number of distinct paths is bounded by a 
polynomial function of n, which forms an interesting con- 
trast with canonical formulations of the parametric short- 
est-path problem, in which the number of distinct paths can 
be super polynomial in n." 

Lastly, for the third question, we show how it is possible 
for agents to be more efficient when nodes and/or edges 
are deleted from the underlying graph; on the other hand, 
if we want to motivate an agent to follow a particular path P 
through the graph, it can be crucial to present the agent with 
a subgraph that includes not just P but also certain addi- 
tional nodes and edges that do not belong to P. We give a 
graph-theoretic characterization of the possible subgraphs 
supporting efficient traversal. 

Before turning to these questions, we first discuss the 
basic graph-theoretic problem in more detail, showing how 
instances of this problem capture the time-inconsistency 
phenomena discussed earlier in this section. 
2. THE GRAPH-THEORETIC MODEL 

In order to argue that our graph-theoretic model captures a 
variety of phenomena that have been studied in connection 
with time-inconsistency, we present a sequence of examples 
to illustrate some of the different behaviors that the model 
exhibits. We note that the example as shown in Figure 1 
already illustrates two simple points: that the path chosen 
by the agent can be suboptimal; and that even if the agent 
traverses an edge e with the intention of following a path P 
that begins with e, it may end up following a different path P’ 
that also begins with e. 

For an edge e in G, let c(e) denote the cost of e; and for 
a path P in G, let e_i(P) denote the i-th edge on P. In terms of 
this notation, the agent’s decision is easy to specify: when 
standing at a node v, it chooses the path P that minimizes 
c(e_1(P)) + β Σ_{i>1} c(e_i(P)) over all P that run from v to t. 
It follows the first edge of P to a new node w, and then performs 
this computation again. 

We begin by observing that Figure 2(a) represents a version 
of the Akerlof example from the introduction. (Recall that 
here we use b to denote β^{-1}.) Node t represents the state in 
which the agent has sent the package, and node v_i represents 
the state in which the agent has reached day i without sending 
the package. The agent has the option of going directly 
from node s to node t, and this is the shortest s-t path. But if 
(b - 1)c > b, then the agent will instead go from s to v_1, 
intending to complete the path s-v_1-t in the next time step. At v_1, 
however, the agent decides to go to v_2, intending to complete 
the path v_1-v_2-t in the next time step. This process continues: 
the agent, following exactly the reasoning in the example 
from the introduction, is procrastinating and not going to t, 
and in the end its path goes all the way to the last node v_n 
(n = 5 in the figure) before finally taking an edge to t. (One minor 
change from the set-up in the introduction is that the 
present-bias effect is applied to the per-day cost of 1 as well; 
this has no real effect on the underlying story.) 
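
The threshold (b - 1)c > b can be checked directly. The short derivation below assumes, consistent with the parenthetical remark above but not verifiable from the figure here, that each edge along the path in Figure 2(a) has cost 1 and each edge into t has cost c.

```latex
% At s the agent compares the direct edge with the two-step plan via v_1.
\begin{align*}
  \text{direct:}\quad & c, \\
  \text{via } v_1\text{:}\quad & 1 + \beta c, \\
  1 + \beta c < c \iff (1-\beta)\,c > 1 & \iff (b-1)\,c > b,
  \qquad b = \beta^{-1}.
\end{align*}
```

The last equivalence multiplies both sides by b > 0. Under the same assumption the comparison is identical at every v_i before the last one, which is why the agent keeps postponing.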


Figure 2. Path problems that exhibit procrastination, abandonment, 
and choice reduction. 
Extending the model to include rewards 

Thus far we cannot talk about an agent who abandons 
its pursuit of the goal midway through, since our model 
requires the agent to construct a path that goes all the way to t. 
A simple extension of the model enables us to consider such 
situations. 

Suppose we place a reward of r at the target node t, which 
will be claimed if the agent reaches t. Standing at a node 
v, the agent now has an expanded set of options: it can follow 
an edge out of v as before, or it can quit taking steps, 
incurring no further cost but also not claiming the reward. 
The agent will choose the latter option precisely when 
either there is no v-t path, or when the minimum cost of a 
v-t path exceeds the value of the reward, evaluated in light of 
present bias: c(e_1(P)) + β Σ_{i>1} c(e_i(P)) > βr for all v-t paths P. 
It is important to note a key feature of this evaluation: the 
reward is always discounted by β relative to the cost that is 
being incurred in the current period, even if the reward will 
be received right after this cost is incurred. (For example, if the path P 
has a single edge, then the agent is comparing c(e_1(P)) to βr.) 

In what follows, we will consider both these models: the 
former fixed-goal model, in which the agent must reach t and 
seeks to minimize its cost; and the latter reward model in 
which the agent trades off cost incurred against reward at t, 
and has the option of stopping partway to t. Aside from this 
distinction, both models share the remaining ingredients, 
based on traversing an s-t path in G. 

It is easy to see that the reward model displays the phenomenon 
of abandonment, in which the agent spends some 
cost to try reaching t, but then subsequently gives up without 
receiving the reward. Consider for example a three-node 
path on nodes s, v_1, and t, with edge costs c(s, v_1) = 1 and c(v_1, t) 
= 4. If β = 1/2 and there is a reward of 7 at t, then the agent 
will traverse the edge (s, v_1) because it evaluates the total cost 
of the path at 1 + 4β = 3 < 7β = 3.5. But once it reaches v_1, it 
evaluates the cost of completing the path at 4 > 7β = 3.5, and 
so it quits without reaching t. 
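
The abandonment example can be replayed with a small simulation of the reward model. This is a sketch under stated assumptions: the names `dist_to`, `walk_with_reward`, and `CHAIN` are hypothetical, and the quitting test uses the strict inequality stated in the text (the agent quits only when the perceived cost exceeds βr).

```python
def dist_to(graph, target):
    """True shortest-path cost to target from each node of a DAG (inf if unreachable)."""
    memo = {target: 0.0}

    def d(v):
        if v not in memo:
            memo[v] = min((c + d(w) for w, c in graph[v].items()), default=float("inf"))
        return memo[v]

    return {v: d(v) for v in graph}


def walk_with_reward(graph, source, target, beta, reward):
    """Naive present-biased agent in the reward model.

    Returns (path, cost_paid, reached_target).
    """
    d = dist_to(graph, target)
    path, paid, v = [source], 0.0, source
    while v != target:
        # Perceived cost of each option: first edge in full, the rest scaled by beta.
        options = {w: c + beta * d[w] for w, c in graph[v].items()}
        # Quit if there is no way forward or every plan looks worse than beta * reward.
        if not options or min(options.values()) > beta * reward:
            return path, paid, False
        w = min(options, key=options.get)
        paid += graph[v][w]
        path.append(w)
        v = w
    return path, paid, True


# The three-node example from the text: c(s, v1) = 1 and c(v1, t) = 4.
CHAIN = {"s": {"v1": 1}, "v1": {"t": 4}, "t": {}}

if __name__ == "__main__":
    print(walk_with_reward(CHAIN, "s", "t", beta=0.5, reward=7))
```

With a reward of 7 the agent pays for the first edge and then stops at v1; raising the reward until βr is at least 4 lets it finish.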


An example involving choice reduction 

It is useful to describe a more complex example that shows 
the modeling power of this shortest-path formalism, and 
also shows how we can use the model to analyze deadlines 
as a form of beneficial choice reduction. (As should be clear, 
with a time-consistent agent it can never help to reduce the 
set of choices; such a phenomenon requires some form of 
time-inconsistency.) First we describe the example in text, 
and then show how to represent it as a graph. 

Imagine a student taking a three-week short course in 
which the required work is to complete two small projects 
by the end of the course. It is up to the student when to do 
the projects, as long as they are done by the end. The stu- 
dent incurs an effort cost of one from any week in which 
she does no projects (since even without projects there is 
still the lower-level effort of attending class), a cost of four 
from any week in which she does one project, and a cost 
of nine from any week in which she does both projects. 
Finally, the student receives a reward of 16 for complet- 
ing the course, and she has a present-bias parameter of 
β = 1/2. 


Figure 2(b) shows how to represent this scenario using a 
graph. Node v_ij corresponds to a state in which i weeks of the 
course are finished, and the student has completed j projects 
so far; we have s = v_00 and t = v_32. All edges go one column 
to the right, indicating that one week will elapse regardless; 
what is under the student’s control is how many rows the 
edge will span. Horizontal edges have cost one, edges that 
descend one row have cost four, and edges that descend two 
rows in a single hop have cost nine. In this way, the graph 
precisely represents the story just described. 

How does the student’s construction of an s-t path work 
out? From s, she goes to v_10 and then to v_20, intending to 
complete the path to t via the edge (v_20, t). But at v_20, she evaluates 
the cost of the edge (v_20, t) as 9 > βr = 16/2 = 8, and so she quits 
without reaching t. The story is thus a familiar one: the stu- 
dent plans to do both projects in the final week of the course, 
but when she reaches the final week, she concludes that it 
would be too costly and so she drops the course instead. 

The instructor can prevent this from happening through 
a very simple intervention. If he requires that the first project 
be completed by the end of the second week of the course, 
this corresponds simply to deleting node v_20 from the graph. 
With v_20 gone, the path-finding problem changes: now the 
student starting at s decides to follow the path s-v_10-v_21-t, and 
at v_10 and then v_21 she continues to select this path, thereby 
reaching t. Thus, by reducing the set of options available to 
the student—and in particular, by imposing an intermedi- 
ate deadline to enforce progress—the instructor is able to 
induce the student to complete the course. 
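
The course example, with and without the intermediate deadline, can be checked mechanically. The sketch below builds the graph from the verbal description (weekly effort 1, 4, or 9 for zero, one, or two projects; reward 16; β = 1/2) and labels nodes as pairs (i, j) for i weeks elapsed and j projects done. The function names and the `deleted` parameter are assumptions for illustration. States in the final week with unfinished projects are kept as dead ends with infinite cost, which does not affect the agent's choices.

```python
COST = {0: 1, 1: 4, 2: 9}      # weekly effort for doing 0, 1 or 2 projects
WEEKS, PROJECTS = 3, 2
BETA, REWARD = 0.5, 16


def course_graph(deleted=()):
    """Nodes (i, j): i weeks elapsed, j projects done; `deleted` nodes are removed."""
    nodes = [(i, j) for i in range(WEEKS + 1) for j in range(PROJECTS + 1)
             if (i, j) not in deleted]
    return {
        (i, j): {(i + 1, k): COST[k - j]
                 for k in range(j, PROJECTS + 1) if (i + 1, k) in nodes}
        for (i, j) in nodes
    }


def student_path(deleted=()):
    """Return (path, finished) for the naive present-biased student."""
    graph, target = course_graph(deleted), (WEEKS, PROJECTS)
    dist = {}
    for v in sorted(graph, reverse=True):   # later weeks first, so successors are ready
        dist[v] = 0.0 if v == target else min(
            (c + dist[w] for w, c in graph[v].items()), default=float("inf"))
    path, v = [(0, 0)], (0, 0)
    while v != target:
        options = {w: c + BETA * dist[w] for w, c in graph[v].items()}
        if not options or min(options.values()) > BETA * REWARD:
            return path, False              # she drops the course
        v = min(options, key=options.get)
        path.append(v)
    return path, True


if __name__ == "__main__":
    print(student_path())                   # no intermediate deadline
    print(student_path(deleted={(2, 0)}))   # first project due by the end of week 2
```

Deleting the single node (2, 0) is the whole intervention: the first two decisions are evaluated exactly as before, but the option of deferring both projects to the last week no longer exists.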

There are many stories like this one about homework and 
deadlines, and our point is not to focus too closely on it in 
particular. Indeed, to return to one of the underpinnings of 
our graph-theoretic formalism, our point is in a sense the 
opposite: it is hard to reason about the space of possible 
“stories,” whereas it is much more tractable to think about 
the space of possible graphs. Thus by encoding the set of 
stories mechanically in the form of graphs, it becomes fea- 
sible to reason about them as a whole. 

We have thus seen how a number of different time- 
inconsistency phenomena arise in simple instances of the 
path-finding problem. The full power of the model, how- 
ever, lies in proving statements that quantify over all graphs; 
we begin this next. 


3. THE COST RATIO: A CHARACTERIZATION VIA 
GRAPH MINORS 

Our path-finding model naturally motivates a basic quantity 
of interest: the cost ratio, defined as the ratio between the 
cost of the path found by the agent and the cost of the short- 
est path. We work here within the fixed-goal version of the 
model, in which the agent is required to reach the goal t and 
the objective is to minimize the cost of the path used. 

To fix notation for this discussion, given a directed acyclic 
graph G on n nodes with positive edge costs, we let d(v, w) 
denote the cost of the shortest v-w path in G (using the true 
edge costs, not modified by present bias). Let P_β(v, t) denote 
the v-t path followed by an agent with present-bias β, and 
let c_β(v, t) be the total cost of this path. The cost ratio can thus 
be written as c_β(s, t)/d(s, t). 


A bad example for the cost ratio 

We first describe a simple construction showing that the 
cost ratio can be exponential in the number of nodes n. We 
then move on to the main result of this section, which is a 
characterization of the instances in which the cost ratio 
achieves this exponential lower bound. 

Our construction is an adaptation of the Akerlof example 
from the introduction. We have a graph that consists of a 
directed path s = v_0, v_1, v_2, ..., v_n, with each v_i also linking 
directly to node t. (The case n = 5 is the graph in Figure 2(a).) 
With b = β^{-1}, we choose any μ < b; we let the cost of the edge 
(v_j, t) be μ^j, and let the cost of each edge (v_j, v_{j+1}) be 0. 

Now, when the agent is standing at node $v_j$, it evaluates 
the cost of going directly to t as $\mu^j$, while the cost of the two-step 
path through $v_{j+1}$ to t is evaluated as $0 + \beta\mu^{j+1} = (\beta\mu)\mu^j 
< \mu^j$. Thus the agent will follow the edge $(v_j, v_{j+1})$ with the 
plan of continuing from $v_{j+1}$ to t. But this holds for all j, so 
once it reaches $v_{j+1}$ it changes its mind and continues on to 
$v_{j+2}$, and so forth. Ultimately it reaches $v_n$, and then must go 
directly to t at a cost of $c_\beta(s, t) = \mu^n$. Since d(s, t) = 1 by using 
the edge directly from s to t, this establishes the exponential 
lower bound on the cost ratio $c_\beta(s, t)/d(s, t)$. Essentially, this 
construction shows that the Akerlof example can be made 
quantitatively much worse than its original formulation by 
having the cost of going directly to the goal grow by a mod- 
est constant factor in each time step; when a present-biased 
agent procrastinates in this case, it ultimately incurs an 
exponentially large cost. 
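The construction can be checked mechanically. The sketch below builds the path-plus-goal graph with exact rational costs and runs an agent that, at each node, moves to the out-neighbor minimizing (edge cost) + β·(shortest remaining cost). The integer node labels, the function names, and the particular values β = 1/2, μ = 3/2, n = 10 are assumptions chosen for illustration; any 1 < μ < 1/β behaves the same way.

```python
from fractions import Fraction


def akerlof_graph(n, mu):
    """Nodes 0..n stand for v_0..v_n and 't' is the goal.

    Edge (v_j, t) costs mu**j; edge (v_j, v_{j+1}) costs 0.
    """
    graph = {j: {"t": mu ** j} for j in range(n + 1)}
    for j in range(n):
        graph[j][j + 1] = Fraction(0)
    return graph


def run_agent(graph, start, goal, beta):
    """Return (cost paid by the present-biased agent, shortest-path cost)."""
    memo = {goal: Fraction(0)}

    def d(v):
        if v not in memo:
            memo[v] = min(c + d(u) for u, c in graph[v].items())
        return memo[v]

    v, paid = start, Fraction(0)
    while v != goal:
        # Immediate edge at full cost, the rest of the plan scaled by beta.
        u = min(graph[v], key=lambda u: graph[v][u] + beta * d(u))
        paid += graph[v][u]
        v = u
    return paid, d(start)


beta, mu, n = Fraction(1, 2), Fraction(3, 2), 10
paid, best = run_agent(akerlof_graph(n, mu), 0, "t", beta)
ratio = paid / best
```

With these values the agent postpones at every node and finally pays μ^n = (3/2)^10, while the direct edge from s costs 1, so the ratio grows geometrically in n.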

The following observation establishes that this is the highest 
possible cost ratio: 


OBSERVATION 3.1. Consider an agent currently at v and let u be 
the next node on $P_\beta(v, t)$. Then $d(u, t) \le b \cdot d(v, t)$. 


PROOF. If u is on the shortest path from v to t then clearly 
the claim holds. Else, since the agent chose to continue to u 
instead of continuing on the shortest path from v to t we have 
$c(v, u) + \beta d(u, t) \le d(v, t)$. This implies that $d(u, t) \le b \cdot d(v, t)$. 
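
Spelled out, the rearrangement in the proof uses only $b = \beta^{-1}$ (so $\beta > 0$) and the fact that edge costs are nonnegative. Applying it step by step along the agent's path gives a bound on how far the shortest remaining cost can drift from d(s, t). The labels $u_0 = s, u_1, \ldots, u_i$ for the first i nodes visited are notation introduced here for illustration.

```latex
\begin{align*}
c(v,u) + \beta\, d(u,t) \le d(v,t)
  &\;\Longrightarrow\; \beta\, d(u,t) \le d(v,t) - c(v,u) \le d(v,t) \\
  &\;\Longrightarrow\; d(u,t) \le \beta^{-1}\, d(v,t) = b \cdot d(v,t), \\[4pt]
d(u_i,t) \le b \cdot d(u_{i-1},t) \le \cdots \le b^{i}\, d(u_0,t)
  &= b^{i}\, d(s,t).
\end{align*}
```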


The observation essentially implies that with each step 
that the agent takes, the cost of the path that it plans to take 
increases by an extra factor of at most b relative to d(s, t). 
Hence, we have a tight upper bound on the cost ratio: 


COROLLARY 3.2. The cost ratio for a graph G with n nodes is at 
most $b^n$. 


A graph minor characterization 

We now provide a structural description of the graphs on 
which the cost ratio can be exponential in the number of 
nodes n—essentially we show that a constant fraction of the 
nodes in such a graph must have the structure of the Akerlof 
example. 

We make this precise using the notion of a minor. Given 
two undirected graphs H and K, we say that H contains a 
K-minor if we can map each node $\kappa$ of K to a connected 
subgraph $S_\kappa$ in H, with the properties that (i) $S_\kappa$ and $S_{\kappa'}$ are 
disjoint for every two nodes $\kappa, \kappa'$ of K, and (ii) if $(\kappa, \kappa')$ is an edge 
of K, then in H there is some edge connecting a node in $S_\kappa$ to 
a node in $S_{\kappa'}$. Informally, the definition means that we can 
build a copy of K using the structure of H, with disjoint connected 
subgraphs of H playing the role of “super-nodes” that 
represent the nodes of K, and with the adjacencies among 
these super-nodes representing the adjacencies in K. The 
minor relation shows up in many well-known results in graph 
theory, perhaps most notably in Kuratowski’s Theorem that 
a nonplanar graph must contain either the complete graph 
$K_5$ or the complete bipartite graph $K_{3,3}$ as a minor.ᵇ 

Our goal here is to show that if G has exponential cost 
ratio, then its undirected version must contain a large copy 
of the graph underlying the Akerlof example as a minor. In 
other words, the Akerlof example is not only a way to pro- 
duce a large cost ratio, but it is in a sense an unavoidable 
signature of any example in which the cost ratio is very large. 

We set this up as follows. Let $\sigma(G)$ denote the skeleton 
of G, the undirected graph obtained by removing the directions 
on the edges of G. Let $F_k$ denote the graph with nodes 
$v_1, v_2, \ldots, v_k$, and w, and edges $(v_i, v_{i+1})$ for $i = 1, \ldots, k-1$, and $(v_i, w)$ 
for $i = 1, \ldots, k$. We refer to $F_k$ as the k-fan. 
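
As a small illustration of these two definitions, the sketch below builds the edge set of the k-fan and the skeleton of a directed graph, representing each undirected edge as a frozenset of its endpoints. The string node labels and the function names are assumptions made for the example.

```python
def fan(k):
    """Undirected edge set of the k-fan F_k on nodes v1..vk and w."""
    path = {frozenset((f"v{i}", f"v{i + 1}")) for i in range(1, k)}
    spokes = {frozenset((f"v{i}", "w")) for i in range(1, k + 1)}
    return path | spokes


def skeleton(directed_edges):
    """Forget edge directions: each (u, v) becomes the unordered pair {u, v}."""
    return {frozenset(edge) for edge in directed_edges}


# A directed path v1 -> ... -> v5 in which every node also links to w
# (the shape of the Akerlof example, with w playing the role of the goal).
akerlof_edges = [(f"v{i}", f"v{i + 1}") for i in range(1, 5)]
akerlof_edges += [(f"v{i}", "w") for i in range(1, 6)]
```

The skeleton of this directed graph is exactly the 5-fan, which is the sense in which the k-fan is the graph underlying the Akerlof example.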

We now claim: 


THEOREM 3.3. For every $\lambda > 1$, if n is sufficiently large and the 
cost ratio is greater than $\lambda^n$, then $\sigma(G)$ contains an $F_k$-minor for 
$k = \Theta(n)$. 

Proof sketch.ᶜ We now provide a sketch of the proof. The 
basic idea is to pin down $k = \Theta(n)$ nodes, $v_1, \ldots, v_k$, on the path 
$P_\beta(s, t)$ and let P be the portion of the path from s to $v_k$. We 
show that from each $v_i$, the shortest path $Q_i$ to t intersects P 
only at $v_i$. This is illustrated in Figure 3. 

Once we have this structure, we obtain an $F_k$-minor by partitioning 
P into segments such that each $v_i$ is in a distinct segment, 
and then we contract each segment. Since all the $Q_i$'s 
are connected to t, we can contract them together (without the 
$v_i$'s). Finally, as all segments of P are connected to one another, 
and each segment is connected to t by $Q_i$, we get a fan $F_k$. 

For simplicity we normalize the edge costs such that 
d(s, t) = 1. We choose the nodes $v_1, \ldots, v_k$ such that for every 
i, $v_i$ is the last node on the path P such that $d(v_i, t) \le b^i$. With 
this choice it is not hard to show that the shortest path from 
$v_i$ to t ($Q_i$) intersects with P only at $v_i$. In particular, $Q_i$ cannot 
intersect with P at any node before $v_i$ since this would create 
a directed cycle and our graph is a directed acyclic graph. 
Furthermore, $Q_i$ cannot intersect with P at any node u after 
$v_i$ since by our choice of $v_i$ we have $d(u, t) > b^i \ge d(v_i, t)$ for every 
node u after $v_i$. 


c This sketch is based on the full proof in our original paper, and also draws 
on the exposition in Tim Roughgarden’s lecture notes about our paper. 


Figure 3. The construction of an $F_k$-minor in Theorem 3.3. 

Finally, we should show that indeed for every $1 \le i \le k$, 
there exists a distinct node $v_i$ with the property that (i) $d(v_i, t) \le b^i$, 
and (ii) for any node u after $v_i$ on $P_\beta(s, t)$, we have $d(u, t) > b^i$. 
First observe that since the cost ratio is greater than $\lambda^n$ and the length of 
the path is at most n, the path must contain an edge (u, v) of 
cost at least $\lambda^n/n$. Roughly speaking, since n is large enough 
there exists $\lambda_0 > 1$ such that $\lambda^n/n \ge \lambda_0^n$. In particular this 
implies that $d(u, t) \ge \lambda_0^n$. By Observation 3.1 we have that with 
each step on the path P the cost of the shortest path can 
increase by at most a factor of b. Thus, there exist k nodes 
as required. 

After the initial publication of our original paper, Tang 
et al. extended Theorem 3.3 as follows: 


THEOREM 3.4. If the cost ratio is greater than $b^{k-2}$, then $\sigma(G)$ 
contains an $F_k$-minor. 


Both Theorem 3.3 and its tighter counterpart Theorem 
3.4 offer some qualitatively relevant advice for thinking 
about the structure of complicated tasks: to avoid creat- 
ing inefficient behavior due to present bias, it is better to 
organize tasks so that they do not contain large fan-like 
structures. The point to appreciate is that such fan-like 
structures are not purely graph-theoretic abstractions; 
they arise in real settings whenever a task has a series of 
“branches” (as in Akerlof’s story) that allows an agent to 
repeatedly put off completing the task. The theorems are 
a way of formalizing the idea that such sets of repeated 
branches are the crux of the reason why present-biased 
individuals incur unnecessary inefficiency in complet- 
ing large tasks. And correspondingly, organizing tasks 
in a way that breaks this type of branching—for example, 
with intermediate deadlines or subgoals—can be a way of 
reducing inefficiency. 


4. COLLECTIONS OF HETEROGENEOUS AGENTS 

Thus far we have focused on the behavior of a single agent 
with a given present-bias parameter $\beta$. Now we consider 
all possible values of $\beta$, and ask the following basic question: 
how large can the set $\{P_\beta(s, t) : \beta \in [0, 1]\}$ be? In 
other words, if for each $\beta$, an agent with parameter $\beta$ were 
to construct an s-t path in G, how many different paths 
would be constructed across all the agents? Bounding this 
quantity tells us how many genuinely “distinct” types of 
behaviors there are for the instance defined by G. Let P(G) 
denote the set $\{P_\beta(s, t) : \beta \in [0, 1]\}$. Despite the fact that $\beta$ 
comes from the continuum [0, 1], the set P(G) is clearly 
finite, since G only has finitely many s-t paths. The ques- 
tion is whether we can obtain a nontrivial upper bound on 
the size of P(G), and in particular one that does not grow 
exponentially in the number of nodes n. In fact this is pos- 
sible, and our main goal in this section is to prove the fol- 
lowing theorem. 


THEOREM 4.1. For every directed acyclic graph G, the size of 
P(G) is $O(n^2)$. Moreover, there exists a graph for which the size 
of P(G) is $\Omega(n^2)$. 

Proof idea. We use the following procedure to “discover” 
all the paths in P(G). We start by taking $\beta = 0$ and let 
$P_0$ be the path that the agent with $\beta = 0$ takes. Now, we gradually 
increase $\beta$ until we reach $\beta^*$ such that an agent with $\beta^*$ will 
take a different path. We claim that there is at least one edge 
(v, u) in $P_0$ that will not take part in any path that an agent 
with $\beta \ge \beta^*$ will take. More generally, each time that we discover 
a new path, we essentially delete at least one edge from 
the graph. Hence the number of paths in P(G) is bounded by 
the number of edges in the graph. 
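
For small instances, P(G) can also be enumerated directly. The sketch below is not the edge-deletion argument of the proof but a brute-force alternative: the agent's choice at a node can only change at a value of β where two of its options tie, so it collects every such tie value in [0, 1] and probes the agent at those values and at the midpoints between consecutive ones. The dictionary encoding, the function names, the first-listed-neighbor tie-breaking, and the example graph are assumptions made for illustration; every non-goal node is assumed to have an outgoing edge.

```python
from fractions import Fraction
from itertools import combinations


def distances(graph, t):
    """Exact d(v, t) for a DAG encoded as {v: {u: cost}}."""
    memo = {t: Fraction(0)}

    def d(v):
        if v not in memo:
            memo[v] = min(c + d(u) for u, c in graph[v].items())
        return memo[v]

    for v in graph:
        d(v)
    return memo


def agent_path(graph, s, t, beta, d):
    """Path of an agent that minimizes c(v, u) + beta * d(u, t) at each node."""
    path = [s]
    while path[-1] != t:
        v = path[-1]
        path.append(min(graph[v], key=lambda u: graph[v][u] + beta * d[u]))
    return tuple(path)


def all_agent_paths(graph, s, t):
    """Set of paths taken as beta ranges over [0, 1]."""
    d = distances(graph, t)
    cuts = {Fraction(0), Fraction(1)}
    for out in graph.values():
        for u1, u2 in combinations(out, 2):
            if d[u1] != d[u2]:
                # Solve c1 + beta * d1 == c2 + beta * d2 for beta.
                tie = (out[u1] - out[u2]) / (d[u2] - d[u1])
                if 0 <= tie <= 1:
                    cuts.add(tie)
    cuts = sorted(cuts)
    probes = cuts + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
    return {agent_path(graph, s, t, beta, d) for beta in probes}


example = {"s": {"t": 4, "a": 1}, "a": {"t": 5}}
paths = all_agent_paths(example, "s", "t")
```

In the example the only tie occurs at β = 3/5: agents below that value take the detour through a, and agents at or above it take the direct edge, so P(G) has two elements.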

Theorem 4.1 tells us that the effect of present bias on the 
path that an agent takes is in some sense limited, as the specific 
value of $\beta$ an agent has determines which path the 
agent will take from a precomputed set of at most $O(n^2)$ different 
paths. Since this $O(n^2)$ quantity is much smaller in general 
than the full set of possible paths, it says that the possible 
heterogeneity in agent behavior based on different levels 
of present bias is not as extensive as it might initially seem. 
Furthermore, quantifying this heterogeneity is a first step 
in designing efficient task graphs for populations of agents 
that are heterogeneous. 


5. MOTIVATING AN AGENT TO REACH THE GOAL 

We now consider the version of the model with rewards: 
there is a reward at t, and the agent has the additional option 
of quitting if it perceives—under its present-biased evalua- 
tion—that the value of the reward is not worth the remain- 
ing cost in the path. 

Note that the presence of the reward does not affect the 
agent’s choice of path, only whether it continues along the 
path. Thus we can clearly determine the minimum reward r 
required to motivate the agent to reach the goal in G by sim- 
ply having it construct a path to t according to our standard 
fixed-goal model, identifying the node at which it perceives 
the remaining cost to be the greatest (due to present bias 
this might not be s), and assigning this maximum perceived 
cost as a reward at t. 
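
The procedure just described can be sketched as follows: walk the agent's path in the fixed-goal model and record the largest perceived remaining cost along the way. The function name and the two-edge instance are hypothetical. Note one assumption about conventions: if the agent also discounts the reward by β when weighing it against this cost, the threshold reward would be the returned value divided by β; the sketch returns the perceived cost itself, as in the text.

```python
def max_perceived_cost(graph, s, t, beta):
    """Return (largest perceived remaining cost on the agent's path, node).

    graph is a DAG encoded as {v: {u: cost}}; the perceived cost at v is
    min over out-neighbors u of c(v, u) + beta * d(u, t).
    """
    memo = {t: 0.0}

    def d(v):
        if v not in memo:
            memo[v] = min(c + d(u) for u, c in graph[v].items())
        return memo[v]

    v, worst, where = s, 0.0, s
    while v != t:
        u = min(graph[v], key=lambda u: graph[v][u] + beta * d(u))
        perceived = graph[v][u] + beta * d(u)
        if perceived > worst:
            worst, where = perceived, v
        v = u
    return worst, where


# Hypothetical task: a cheap first step followed by an expensive one.
example = {"s": {"a": 1}, "a": {"t": 6}}
```

Here an agent with β = 1/2 perceives a cost of 1 + 3 = 4 at s but a cost of 6 once it stands at a, so the binding node is a rather than s.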

A more challenging question is suggested by the possi- 
bility of deleting nodes and edges from G; recall that Figure 
2(b) showed a basic example in which the instructor of a 
course was able to motivate a student to finish the course- 
work by deleting a node from the underlying graph. (This 
deletion essentially corresponded to introducing a deadline 
for the first piece of work.) This shows that even if the reward 
remains fixed, in general it may be possible for a designer 
to remove parts of the graph, thereby reducing the set of 
options available to the agent, so as to get the agent to reach 
the goal. We now consider the structure of the subgraphs 
that naturally arise from this process. 


Motivating subgraphs: A fundamental example 

The basic set-up we consider is the following. Suppose the 
agent in the reward model is trying to construct a path from 
s to t in G; the reward r is not under our control—perhaps it 
is defined by a third party, or represents an intrinsic reward 
that we can not augment—but we are able to remove nodes 
and edges from the graph (essentially by declaring certain 
activities invalid, as the deadline did in Figure 2(b)). Let us 
say that a subgraph G’ of G motivates the agent if in G' with 
reward r, the agent reaches the goal node t. (We will also 
refer to G' as a motivating subgraph.) Note that it is possible 
for the full graph G to be a motivating subgraph. 

It would be natural to conjecture that if there is any sub- 
graph G’ of G that motivates the agent, then there is a moti- 
vating subgraph consisting simply of an s-t path P that the 
agent follows. In fact this is not the case. Figure 4 shows a 
graph illustrating a phenomenon that we find somewhat 
surprising a priori, though not hard to verify from the example. 
In the graph G depicted in the figure, an agent with 
$\beta = 1/2$ will reach the goal t. However, there is no proper 
subgraph of G in which the agent will reach the goal. The 
point is that the agent starts out expecting to follow the path 
s-a-b-t, but when it gets to node a it finds the remainder 
of the path a-b-t too expensive to justify the reward, and it 
switches to a-c-t for the remainder. With just the path s-a-b-t in 
isolation, the agent would get stuck at a; and with just s-a-c-t, 
the agent would never start out from s. It is crucial that the agent 
mistakenly believes the upper path is an option in order to 
eventually use the lower path to reach the goal. 
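
The phenomenon can be reproduced by brute force on a small instance. The edge costs of Figure 4 are not reproduced in this text, so the sketch below uses a hypothetical instance with the same shape (β = 1/2, reward 24, and the five costs in EDGES), chosen for illustration. It also assumes the convention that an agent at v continues only if its perceived cost, the minimum over out-neighbors u of c(v, u) + β·d(u, t), is at most β·r. The search simply tries every subset of edges.

```python
import math
from itertools import combinations

# Hypothetical costs: upper path s-a-b-t and lower path s-a-c-t.
EDGES = {("s", "a"): 4, ("a", "b"): 12, ("b", "t"): 2,
         ("a", "c"): 5, ("c", "t"): 12}


def reaches_goal(edges, s, t, beta, reward):
    """True if the present-biased agent gets from s to t without quitting."""
    out = {}
    for (v, u), cost in edges.items():
        out.setdefault(v, {})[u] = cost
    memo = {t: 0.0}

    def d(v):
        if v not in memo:
            memo[v] = min(
                (c + d(u) for u, c in out.get(v, {}).items()),
                default=math.inf,
            )
        return memo[v]

    v = s
    while v != t:
        options = out.get(v, {})
        if not options:
            return False
        u = min(options, key=lambda u: options[u] + beta * d(u))
        # Quit if even the best plan looks worse than the discounted reward.
        if options[u] + beta * d(u) > beta * reward:
            return False
        v = u
    return True


def motivating_subgraphs(edges, s, t, beta, reward):
    """All edge subsets on which the agent reaches t (exponential search)."""
    items = list(edges.items())
    found = []
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            if reaches_goal(dict(subset), s, t, beta, reward):
                found.append(set(dict(subset)))
    return found


result = motivating_subgraphs(EDGES, "s", "t", 0.5, 24)
```

On this instance the only motivating edge set is the full graph: with the upper path alone the agent quits at a (perceived cost 13 against a discounted reward of 12), and with the lower path alone it never leaves s (perceived cost 12.5).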

It is interesting, of course, to consider real-life analogues 
of this phenomenon. In some settings, the structure in 
Figure 4 could correspond to deceptive practices on the part 
of the designer of G—in other words, inducing the agent to 
reach the goal by misleading them at the outset. But there 
are other settings in real life where one could argue that the 
type of deception represented here is more subtle, not any 
one party’s responsibility, and potentially even salutary. For 
example, suppose the graph schematically represents the 
learning of a skill such as a musical instrument. There’s 
the initial commitment corresponding to the edge (s, a), 
and then the fork at a where one needs to decide whether 
to “get serious about it” (taking the expensive edge (a, b)) or 
not (taking the cheaper edge (a, c)). In this case, the agent’s 
trajectory could describe the story of someone who derived 
personal value from learning the violin (the lower path) 
even though at the outset they believed incorrectly that they 
would be willing to put the work into becoming a concert 
violinist (the upper path). 


The structure of minimal motivating subgraphs 

Given that there is sometimes no single path in G that is 
motivating, how rich a subgraph do we necessarily need to 
motivate the agent? Let us say that a subgraph G* of G is a 
minimal motivating subgraph if (i) G* is motivating, and (ii) 
no proper subgraph of G* is motivating. Thus, for example, 
in Figure 4, the graph G is a minimal motivating subgraph of 
itself; no proper subgraph of G is motivating. 


Figure 4. A minimal subgraph for getting an agent to reach t. 

Concretely, then, we can ask the following question: what 
can a minimal motivating subgraph look like? For example, 
could it be arbitrarily dense with edges? 

In fact, minimal motivating subgraphs necessarily have 
a sparse structure, which we now describe in our next theo- 
rem. To set up this result, we need the following definition. 
Given a directed acyclic graph G and a path P in G, we say that 
a path Q in G is a P-bypass if the first and last nodes of Q lie on 
P, and no other nodes of Q do; in other words, $P \cap Q$ is equal 
to the two ends of Q. We now have: 
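
The bypass definition translates directly into a membership test. In the sketch below, paths are given as node sequences; the function name and the example paths are assumptions made for illustration.

```python
def is_bypass(P, Q):
    """True if Q is a P-bypass: Q's first and last nodes lie on P and
    none of Q's interior nodes do.

    P and Q are sequences of nodes along directed paths in the same DAG.
    """
    on_p = set(P)
    return (
        len(Q) >= 2
        and Q[0] in on_p
        and Q[-1] in on_p
        and not any(v in on_p for v in Q[1:-1])
    )


P = ["s", "a", "b", "t"]
```

For instance, relative to P = s-a-b-t, the path a-c-t is a bypass, whereas a-c is not (it ends off P) and a-b-t is not (its interior node b lies on P).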


THEOREM 5.1. If G* is a minimal motivating subgraph, then it 
contains an s-t path P* with the properties that 


(i) Every edge of G* is either part of P* or lies on a 
P*-bypass in G*; and 

(ii) Every node of G* has at most one outgoing edge that 
does not lie on P*. 


Proof sketch. Roughly speaking, there are two types 
of edges that should be included in a minimal motivating 
subgraph: edges that the agent will actually take (these 
are the edges of the path P*) and edges that at some point 
the agent (wrongly) believes that it will take (these are 
the edges on the P*-bypasses). It is clearly the case that 
an edge e that the agent never plans to take can be safely 
removed from the graph without affecting the agent’s 
decisions. Furthermore, since all the bypass edges are 
only used in shortest path computations it is impossible 
for a node v in P* to have two neighbors $w_1$ and $w_2$ not on 
P* such that the agent at some $v_1$ on P* plans to follow a 
path that includes the edge $(v, w_1)$ and an agent at some $v_2$ 
on P* plans to follow a path that includes the edge $(v, w_2)$. 
This is simply because if $c(v, w_1) + d(w_1, t) < c(v, w_2) + d(w_2, t)$ 
the agent will choose the path that contains $(v, w_1)$ both 
when standing at $v_1$ and at $v_2$, and otherwise it will choose 
the path that contains $(v, w_2)$ in both cases. 

After the publication of our original paper, Tang et al. 
and Albers and Kraft independently showed that determining 
whether a task graph admits a motivating subgraph is 
NP-complete. One way of circumventing these hardness 
results is to identify task graphs that are more common in 
practice and ask whether on these graphs the motivating 
subgraph problem can be solved in polynomial time. 
Alternatively, we can consider approximation algorithms. 
Albers and Kraft considered a variant of this question: 
what is the minimum reward r for which a motivating subgraph 
exists? They showed that this problem cannot be 
approximated by a factor better than $\sqrt{n}/3$, and they presented 
a $(\sqrt{n} + 1)$-approximation algorithm. Interestingly, 
the subgraph achieving the $\sqrt{n} + 1$ approximation is a path. 
Surprisingly, in a different paper, Albers and Kraft were 
able to break the $\sqrt{n}$ barrier and presented a 2-approximation 
algorithm for a more powerful designer who is able to 
increase the costs of edges. 

An alternative way to motivate an agent to reach t is to 
place intermediate rewards on specific nodes or edges; the 
agent will claim each reward if it reaches the node or edge 
on which it is placed. Now the question is to place rewards 
on the nodes or edges of an instance G such that the agent 
reaches the goal t while claiming as little total reward as pos- 
sible; this corresponds to the designer’s objective to pay out 
as little as possible while still motivating the agent to reach 
the goal. Following our paper, Albers and Kraft and Tang 
et al. studied different versions of this question and showed 
that the problem of assigning intermediate rewards in an 
optimal way is NP-complete. A version worth mentioning is 
the one in which the designer only cares about minimizing 
the rewards that the agent actually takes. Such a formulation 
of the problem comes with the danger that a designer would 
be able to create “exploitive” solutions in which the agent is 
motivated by intermediate rewards that it will never claim, 
because these rewards are on nodes that the agent will never 
reach. 


6. CONCLUSION AND SUBSEQUENT WORK 

We have developed a graph-theoretic model in which an 
agent constructs a path from a start node s to a goal node 
t in an underlying graph G representing a sequence of 
tasks. Time-inconsistent agents may plan an s-t path that 
is different from the one they actually follow, and this type 
of behavior in the model can reproduce a range of qualita- 
tive phenomena including procrastination, abandonment 
of long-range tasks, and the benefits of a reduced set of 
options. Our results provide characterizations for a set of 
basic structures in this model, including for graphs achiev- 
ing the highest cost ratios between time-inconsistent agents 
and shortest paths, and we have investigated the structure 
of minimal graphs on which an agent is motivated to reach 
the goal node. 

There is a wide range of broader issues for further work. 
These include finding structural properties beyond our 
graph-minor characterization that have a bearing on the 
cost ratio of a given instance; obtaining a deeper under- 
standing of the relationship between agents with different 
levels of time-inconsistency as measured by different values 
of $\beta$; and developing algorithms for designing graph struc- 
tures that motivate effort as efficiently as possible, includ- 
ing for multiple agents with diverse time-inconsistency 
properties. 

In work following the initial appearance of our paper, 
the results were extended in several subsequent lines of 
research. We have discussed several of these further results 
in the text thus far, including a tighter graph-minor characterization 
of instances with high cost ratio, and a set of 
hardness results and approximation algorithms for motivating 
subgraphs. Further results beyond these have 
considered the behavior of different types of agents. Gravin 
et al. studied the behavior of a present-biased agent whose 
present-bias parameter is not fixed over time; rather, after 
each step it re-samples the parameter from some fixed distribution. 
In Kleinberg et al., the authors of the present 
paper together with Manish Raghavan studied the behavior 
of sophisticated agents (building on a formalization from 
O’Donoghue and Rabin), who are aware of their present 
bias and can take it into account in their planning. Such 
agents are not able to ignore or disregard their present 

Technical Perspective 


On Heartbleed: 


A Hard Beginnyng Makth a Good Endyng 


JOHN HEYWOOD (1497-1580) 
By Kenny Paterson 


The SSL/TLS protocol suite has become the 
de facto secure protocol for communi- 
cations on the Web, protecting billions 
of communications sessions between 
browsers and servers on a daily basis. We 
use it every time we access our social me- 
dia feeds, or whenever an app running 
on our mobile device wants to contact its 
home server. It has become an almost in- 
visible part of the Web’s security infra- 
structure, supported by an eclectic mix of 
technologies including public key cryp- 
tography, certificates, and the Web PKI. 
So when a serious security vulnera- 
bility is discovered in the SSL/TLS pro- 
tocol itself, or in one of the main imple- 
mentations like OpenSSL, one would 
naturally expect a rapid response—sys- 
tem administrators would roll into ac- 
tion, patching their software as quickly 
as possible, and taking any other reme- 
dial actions that might be necessary. 
The following paper by Zhang et al. 
paints a very different picture in the con- 
text of the most famous SSL/TLS vulner- 
ability of all, Heartbleed. The Heartbleed 
vulnerability resides in the OpenSSL im- 
plementation of the Heartbeat protocol. 
The Heartbeat protocol is an extension 
of SSL/TLS for checking the “liveness” of 
a connection. The Heartbleed bug lay in 
OpenSSL’s failure to correctly perform 
bounds checking when processing 
Heartbeat messages. An attacker, situated 
anywhere on the planet, could induce 
a server to return a large amount of data to 
the attacker from arbitrary (but uncontrolled) 
portions of its memory. This memory 
leak would allow the attacker to learn a 
vulnerable server’s private key, with di- 
sastrous security consequences. 
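
The nature of the flaw can be illustrated with a toy model. The sketch below is not OpenSSL's code: it is a simplified Python simulation in which a single bytes object stands in for adjacent server memory, and the function names and buffer contents are invented for the example. A Heartbeat request carries a payload together with a length field that the sender controls; the unchecked reply trusts that field, while the checked reply discards any request whose claimed length exceeds the payload actually received (the behavior that RFC 6520 requires).

```python
# Toy stand-in for server memory: a 4-byte heartbeat payload that happens
# to be stored right before unrelated sensitive data.
MEMORY = b"bird" + b"SECRET-KEY"


def reply_unchecked(memory, start, claimed_len):
    """Echo claimed_len bytes starting at the payload, trusting the sender."""
    return bytes(memory[start:start + claimed_len])


def reply_checked(memory, start, actual_len, claimed_len):
    """Echo the payload only if the claimed length fits what was received."""
    if claimed_len > actual_len:
        return None  # malformed request: discard silently
    return bytes(memory[start:start + claimed_len])


honest = reply_unchecked(MEMORY, 0, 4)
leaked = reply_unchecked(MEMORY, 0, 14)
blocked = reply_checked(MEMORY, 0, 4, 14)
```

An honest request echoes the four payload bytes; a request that claims 14 bytes makes the unchecked version return the neighboring secret as well, and the checked version returns nothing.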
Heartbleed became public in early 
April 2014. The Internet community rap- 
idly developed free Heartbleed scanning 
services, and published statistics on 
vulnerable servers. Some websites were 
actually attacked, though all hosts in 
the Alexa top 500 were patched within 
48 hours. As well as immediately patch- 
ing OpenSSL to remove the vulnerability, 
security experts recommended that system 
administrators revoke their 
public key certificates, generate new 
key pairs, and request their Certifica- 
tion Authorities (CAs) to issue new 
certificates. Private keys, among the 
security “crown jewels” for SSL/TLS, 
had potentially been compromised. 

The following paper examines to 
what extent revocation and reissuance 
of certificates happened post-Heart- 
bleed. The short and surprising answer: 
not so much. Zhang et al. cleverly use the 
Heartbleed incident as an opportunity 
to perform a natural experiment, track- 
ing the rate at which large websites from 
the Alexa top 1M chose to revoke and re- 
issue certificates. They carefully assess 
the extent to which those sites would 
have been vulnerable to Heartbleed, and 
then use the open nature of the Web to 
collect information about when those 
sites’ certificates were actually changed, 
and whether the previous certificates 
were properly revoked. To a security- 
conscious reader, the final statistics 
make depressing reading. For example, 
of roughly 107,712 vulnerable websites 
(that is, websites for which the private 
key could have been exposed), only 
26.7% had reissued certificates by the 
end of April 2014, while 60% of those 
sites did not properly revoke their old 
certificates (meaning that, had the cor- 
responding private keys been exposed, 
the sites would still be vulnerable). 

The authors discuss some of the 
reasons why such low rates of revoca- 
tion and reissuance were seen. They 
also report anecdotal evidence 
gleaned from surveying system administrators. 
A key reason is that the processes 
are manual rather than 
automated. One may also advance the 
argument that customers pay CAs for 
certificates, and security budgets are 
always under pressure. One possible 
perception is that the window of expo- 
sure was small, because most vulnera- 
ble systems were patched quickly. This 
belies the fact that the vulnerability 
was present in the OpenSSL code for 
more than two years, and could have 
been exploited during that time. A third 
possible reason is ignorance on the 
part of sysadmins: while patching is 
part of the everyday sysadmin culture, 
dealing with certificates and keys is 
not. The paper points out that a more systematic 
study is needed in order to understand 
certificate management and 
how it is tackled by sysadmins. 

More broadly, the Heartbleed inci- 
dent has had a lasting and net-positive 
impact on the security of the Web. The 
bug led to a much closer inspection of 
the OpenSSL code base, leading to oth- 
er problems being subsequently dis- 
covered. It also led to a wider debate 
about the wisdom of having a security 
monoculture, about the quality of the 
OpenSSL code, and about the way in 
which the OpenSSL project was being 
run. And it triggered a debate about the 
responsibilities of large companies 
who make free use of open source soft- 
ware like OpenSSL without contribut- 
ing materially to its development. 

Heartbleed resulted in major indus- 
try players forming the Core Infrastruc- 
ture Initiative, a project intended to 
fund critical elements of the global in- 
formation infrastructure. OpenSSL has 
in turn significantly revised its opera- 
tions, expanded the development team, 
and heavily refactored the codebase. 

In parallel, Google initiated its own 
fork of OpenSSL called BoringSSL. The 
naming is considered, with “boring” 
being intended to convey an impres- 
sion of “no nasty surprises.” In October 
2015, Google announced it had 
switched over to BoringSSL across its 
entire codebase (comprising several 
billion lines of code). 

The “Let’s Encrypt” initiative has 
started to address the problem of help- 
ing sysadmins to better manage certifi- 
cates. Let’s Encrypt is a free CA, which 
simplifies the process of obtaining cer- 
tificates and switching on SSL/TLS for 
websites. In mid-2017, its 100-millionth 
certificate was issued, and Mozilla’s te- 
lemetry indicated more than half of all 
HTTP connections were protected. 
Analysis of SSL Certificate 
Reissues and Revocations 
in the Wake of Heartbleed 


By Liang Zhang, David Choffnes, Tudor Dumitras, Dave Levin, Alan Mislove, Aaron Schulman, and Christo Wilson 


Abstract 

A properly managed public key infrastructure (PKI) is 
critical to ensure secure communication on the Internet. 
Surprisingly, some of the most important administrative 
steps—in particular, reissuing new X.509 certificates and 
revoking old ones—are manual and remained unstudied, 
largely because it is difficult to measure these manual pro- 
cesses at scale. 

We use Heartbleed, a widespread OpenSSL vulnerability 
from 2014, as a natural experiment to determine whether 
administrators are properly managing their certificates. All 
domains affected by Heartbleed should have patched their 
software, revoked their old (possibly compromised) certifi- 
cates, and reissued new ones, all as quickly as possible. We 
find the reality to be far from the ideal: over 73% of vulnerable 
certificates were not reissued and over 87% were not revoked 
three weeks after Heartbleed was disclosed. Our results 
also show a drastic decline in revocations on the weekends, 
even immediately following the Heartbleed announcement. 
These results are an important step in understanding the 
manual processes on which users rely for secure, authenti- 
cated communication. 


1. INTRODUCTION 

Server authentication is the cornerstone of secure com- 
munication on the Internet; it is the property that allows 
client applications such as online banking, email, and 
e-commerce to ensure the servers with whom they com- 
municate are truly who they say they are. In practice, 
server authentication is made possible by the globally 
distributed Public Key Infrastructure (PKI). The PKI lever- 
ages cryptographic mechanisms and X.509 certificates to 
establish the identities of popular websites. This mecha- 
nism works in conjunction with other network protocols— 
particularly Secure Sockets Layer (SSL) and Transport Layer 
Security (TLS)—to provide secure communications, but the 
PKI plays a key role: without it, a browser could establish 
a secure connection with an attacker that impersonates a 
trusted website. 

The secure operation of the web’s PKI relies on respon- 
sible administration. When a software vulnerability is dis- 
covered, administrators must act quickly and deploy the 
patch to prevent attackers from exploiting the vulnerability. 
Similarly, after a potential key compromise, website administrators 
must revoke the corresponding certificates to prevent 
attackers from intercepting encrypted communications 
between browsers and servers. A recent study suggests 
0.2% of SSL connections to Facebook correspond to such 
man-in-the-middle attacks. After considerable research 
into understanding and improving the speed at which 
software is patched, much of software patching has 
become automated. However, the web’s PKI requires a sur- 
prising amount of manual administration. To revoke a cer- 
tificate, website administrators must send a request to their 
Certificate Authority (CA), and this request may be manually 
reviewed before the certificates are finally added to a list that 
browsers (are supposed to) check. Such operations occur at 
human timescales (hours or days) instead of computer ones 
(seconds or minutes). An important open question is: when 
private keys are compromised, how long are SSL clients 
exposed to potential attacks? 

Historically, these manual processes have been difficult 
to measure: how can one measure, at scale, how long these 
processes take if we do not know how often, or precisely 
when, administrators realize their keys are compromised? 
In this paper, we use a widespread security vulnerability 
from 2014, Heartbleed, as a natural experiment: the moment 
Heartbleed was announced, all administrators of vulner- 
able servers should have initiated their manual processes 
as quickly as possible.* This natural experiment allows us 
to measure at scale the manual administration of the web’s 
PKI. In particular, this paper focuses on the response to 
the public announcement of Heartbleed, in terms of how 
quickly certificates were reissued and whether or not the 
certificates were eventually revoked. 

Our results expose incomplete and slow administrative 
practices that ultimately weaken the security of today’s PKI. 
On the positive side, we also identify ways in which the PKI 
can be strengthened. Our hope is that, through better under- 
standing how the PKI operates in practice, the security and 
research community can take concrete steps toward improv- 
ing this system on which virtually all Internet users rely. 


2. BACKGROUND 

In this section, we review the relevant background of SSL/ 
TLS and the PKI, and we describe the Heartbleed vulnerabil- 
ity that serves as our natural experiment. 


The original version of this paper was published in 
the 2014 ACM Internet Measurement Conference (IMC’14). 
2.1. Certificates
One of the critical components of the PKI is a certificate: 
a signed attestation binding a human-understandable 
subject (a domain name or business name) to a public key. 
Certificates are signed by a CA, who in turn has its own cer- 
tificate, etc., forming a logical chain that terminates at a 
self-signed root certificate. By issuing a certificate, a CA is 
essentially asserting “this subject is the sole owner of the 
private key corresponding to this certificate’s public key.” 
Thus, if someone can prove knowledge of the private key 
(e.g., by using it to sign a message), then this proves that 
this someone is the subject. As a result, anyone who has 
knowledge of the private key can pretend to be that subject. 

The assumption that only legitimate subjects hold the 
corresponding private keys is central to the PKI’s ability to 
authenticate servers. Unfortunately, keys can be compromised.
For example, software vulnerabilities in SSL implementations
have resulted in predictable keys or the ability
to read sensitive server-side data.

When a private key is compromised, a responsible administrator
must do at least three things: first, the administrator
must patch the vulnerable software. Second, because the 
certificate’s private key has been compromised, the admin- 
istrator must generate a new key pair and ask their CA to reis- 
sue a new certificate with this new private key. However, the 
old, compromised certificate would still exist, and it could 
be used by a malicious party to undetectably impersonate 
the website. Thus, there is a critical third step an admin- 
istrator must do in response to a key compromise: revoke 
the old certificate. It is important to note that the final two 
steps necessarily involve the CA, and many CAs charge for 
these actions, which may lead to perverse incentives for site 
owners. 


2.2. Certificate reissue 

When a website stops using a certificate—for instance, 
because the certificate has been compromised, or because 
it expired—they must obtain a new certificate. This proc- 
ess is referred to as reissuing the certificate. To do so, the 
system administrator must contact the CA who signed their 
certificate and request a new certificate and signature. In 
cases where the private key may have been compromised, 
the administrator should also choose a new public/private 
key pair (since reissuing the certificate with the old public 
key does nothing to mitigate attacks that leverage the leaked 
private key). 


2.3. Certificate revocation 
A certificate revocation is another signed attestation from a 
CA, which essentially states “this certificate should no lon- 
ger be considered valid.” CAs are responsible for making 
revocations available for download, and typically do so with 
Certificate Revocation Lists (CRLs). Browsers (are supposed 
to) download CRLs to check if a presented certificate has 
been revoked; the longer an administrator waits to revoke 
compromised certificates, the longer users are susceptible 
to man-in-the-middle attacks. 

The PKI uses a default-valid model, where potentially 
compromised certificates remain valid until their expiration 
date or until they are revoked. The security of any PKI is thus
critically dependent on the timeliness of certificate revoca- 
tions. However, requesting a revocation is a surprisingly 
manual process, typically requiring an administrator to visit 
a website, fill in a form, provide a reason for the revocation, 
and wait for a representative at the CA to manually inspect 
the request before issuing the revocation. 

While it seems natural to assume that certificates are reis- 
sued at precisely the moment the old certificate is revoked, 
in fact today’s PKI protocols make no such requirement. 
As our study will demonstrate, reissues can happen before, 
during, or after a revocation—or even without revoking the 
old certificate at all. To the best of our knowledge, we are the 
first to correlate revocations with reissues. 


2.4. Heartbleed 

Heartbleed is a buffer over-read vulnerability discovered in
OpenSSL versions 1.0.1 (released March 14, 2012) through
1.0.1f. The vulnerability stems from a bug in OpenSSL’s
implementation of the TLS Heartbeat Extension. The
intended functionality of TLS Heartbeat is to allow a client
to test a secure communication channel by sending a
“heartbeat” message consisting of a string and the 16-bit
payload_length of this string. Unfortunately, vulnerable
OpenSSL versions fail to check that the payload_length
supplied by the client matches the length of the
provided string. This allows a malicious client to craft a
heartbeat message containing a 1-byte string and $2^{16} - 1$ as
the payload_length. In this case, OpenSSL will allocate
a 64KB block of heap memory, memcpy() 64KB of data
into it, starting with the 1-byte string, and finally send the
contents of the entire buffer to the client. This allows the
malicious client to read up to $2^{16} - 2$ bytes of the server’s
heap memory, although the client cannot choose which
memory is read.
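
To make the missing length check concrete, the following toy simulation models the server's heap as a flat byte string. It is a sketch under simplifying assumptions, not OpenSSL's actual code: the names build_heap, handle_heartbeat_vulnerable, and handle_heartbeat_patched, and the SECRET placeholder, are hypothetical and chosen only for illustration.

```python
# Toy model: the server "heap" is a flat byte string in which the client's
# heartbeat payload happens to sit next to unrelated secret data.
SECRET = b"-----PRIVATE KEY MATERIAL-----"  # hypothetical placeholder


def build_heap(payload: bytes) -> bytes:
    """Place the received payload in memory, followed by other heap data."""
    return payload + SECRET + b"\x00" * 64


def handle_heartbeat_vulnerable(payload: bytes, payload_length: int) -> bytes:
    """Echo payload_length bytes, trusting the client-supplied length."""
    heap = build_heap(payload)
    # Bug: payload_length is never compared with len(payload).
    return heap[:payload_length]


def handle_heartbeat_patched(payload: bytes, payload_length: int) -> bytes:
    """Discard heartbeats whose claimed length exceeds the real payload."""
    if payload_length > len(payload):
        return b""  # drop the malformed message without replying
    return payload[:payload_length]


leaked = handle_heartbeat_vulnerable(b"A", 2**16 - 1)
print(SECRET in leaked)                           # the secret bytes are echoed back
print(handle_heartbeat_patched(b"A", 2**16 - 1))  # nothing is returned
```

In this model the vulnerable handler returns whatever bytes happen to follow the 1-byte payload, which mirrors why the client cannot choose which memory is read.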

By repeatedly exploiting Heartbleed, an attacker can
extract sensitive data from the server, including SSL private
keys. To make matters worse, OpenSSL does not log
heartbeat messages, giving attackers free rein to undetectably
exploit Heartbleed. Given the severity and undetectable
nature of Heartbleed, site operators were urged to immediately
update their OpenSSL software and revoke and reissue
their certificates.

Why study Heartbleed? Heartbleed was first discovered
by Neel Mehta from Google on March 21, 2014. On April 7,
the bug became public and the OpenSSL project released a
patched version (1.0.1g) of the OpenSSL library.

The significance of this timeline, and of Heartbleed 
in general, is that it represents a point in time after which 
the administrators of all vulnerable servers should have (1) 
patched their server, (2) revoked their old certificate, and 
(3) issued a new one. The scope of this vulnerability (it is
estimated that up to 17% of all HTTPS web servers were
vulnerable) makes it an ideal case study for evaluating
large-scale properties of SSL security in the face of private
key compromise. As a result, Heartbleed acts as a natural
experiment, allowing us to measure how completely and
quickly administrators took steps to secure their keys. While
such events are (sadly) not uncommon, the intense press
coverage surrounding Heartbleed reduces the likelihood
that administrators failed to take action because they were
unaware of the vulnerability.


3. DATA AND METHODS 

We now describe the data sets that we collected and our 
methodology for determining a host’s SSL certificate, when 
it was in use, if and when the certificate was revoked, and 
if the host was (or is still) vulnerable to the Heartbleed bug. 


3.1. Certificate data source 
We obtain our collection of SSL certificates from (roughly) 
weekly scans of the entire IPv4 address space made available 
by Rapid7. We use scans collected between October 30,
2013 and April 28, 2014. There are a total of 28 scans during 
this period, giving an average of 6.7 days (with a minimum 
of 3 days and maximum of 9 days) between successive scans. 
The scans found an average of 26.9 million hosts respond- 
ing to SSL handshakes on port 443 (an average of 9.12% of 
the entire IPv4 address space). Across all of the scans, we 
observed a total of 19,438,865 unique certificates (includ- 
ing all leaf and CA certificates). In the sections below, we 
describe how we filtered and validated this data set; an over- 
view of the process is provided in Figure 1. 


3.2. Filtering data 

To focus on web destinations that are commonly accessed 
by users, we use the Alexa Top-1M domains as observed on
April 28, 2014. We first extract all leaf (non-CA) certificates 
that advertise a Common Name (CN) that is in one of the 
domains in the Alexa list (e.g., we would include certificates 
for facebook.com, www.facebook.com, as well as *.dev.face- 
book.com). This set represents 1,573,332 certificates (8.1% 
of all certificates). 

Unfortunately, despite leaf certificates having a CN in the 
Alexa list, many may not be valid (e.g., expired certificates, 
forged certificates, certificates signed by an unrecognized 
root, etc.). We removed these invalid certificates* by running 
openssl verify on each certificate (and its corresponding
chain). We configure OpenSSL to trust the root CA certificates
included by default in the OS X 10.9.2 root store; this
includes 222 unique root certificates.


After validation, we are left with 628,692 leaf certificates 
(40.0% of all certificates advertising Alexa domains and 3.2% 
of all certificates). We refer to this set of certificates as the 
Leaf Set, each of which has a valid chain. We refer to the set 
of all CA certificates on these chains (not including the leaf 
certificates) as the CA Set, which contains 910 unique certifi- 
cates. The Leaf Set certificates cover 166,124 (16.6%) of the 
Alexa Top-1M domains. This is the set of certificates and 
chains that we use in the remainder of the paper. 
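
The percentages quoted in Sections 3.1 and 3.2 can be re-derived from the reported counts. The short sketch below does only that arithmetic; the constant names and the pct helper are our own, introduced for illustration.

```python
# Counts reported in Sections 3.1 and 3.2.
ALL_CERTS = 19_438_865        # unique certificates across all scans
ALEXA_CN_CERTS = 1_573_332    # leaf certificates with a CN in the Alexa list
LEAF_SET = 628_692            # leaf certificates that pass validation
ALEXA_DOMAINS = 1_000_000     # size of the Alexa Top-1M list
COVERED_DOMAINS = 166_124     # Alexa domains covered by the Leaf Set


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print(pct(ALEXA_CN_CERTS, ALL_CERTS))      # Alexa-named share of all certificates
print(pct(LEAF_SET, ALEXA_CN_CERTS))       # valid share of Alexa-named certificates
print(pct(LEAF_SET, ALL_CERTS))            # Leaf Set share of all certificates
print(pct(COVERED_DOMAINS, ALEXA_DOMAINS)) # Alexa domains covered by the Leaf Set
```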


3.3. Collecting CRLs 

To determine if and when certificates were revoked, we 
extracted the CRL URLs out of all Leaf Set certificates. We 
found 626,659 (99.7%) of these certificates to include at 
least one well-formed, reachable CRL URL. For certificates 
that included multiple CRL URLs, we included them all. We 
found a total of 1,386 unique CRL URLs (most certificates 
use a unified CRL provided by the signing CA, so the small 
number of CRLs is not surprising). We downloaded all of 
these CRLs on May 6, 2014, and found 45,268 (7.2%) of the 
Leaf Set certificates to be revoked. 

We also collected the CRL URLs for all certificates in the 
CA set. We found that 884 (97.1%) of the certificates in the 
CA Set included a reachable CRL; the union of these URLs 
comprised 246 unique reachable URLs. We downloaded 
these CRLs on May 6, 2014, as well. We found a total of seven 
CA certificates that were revoked, which invalidated 60 cer- 
tificates in the Leaf Set (< 0.01%). 


3.4. Inferring the Heartbleed vulnerability 
Finally, we wish to determine if a site was ever vulnerable to 
the Heartbleed OpenSSL vulnerability (and if it continued 
to be vulnerable at the end of the study). Doing so allows 
us to reason about whether the site operators should have 
reissued their SSL certificate(s) and revoked their old one(s). 
Determining if a host is currently vulnerable to Heartbleed 
is relatively easy, as one can simply send an improperly-formatted
SSL heartbeat message with a payload_length of
0 to test for vulnerability without exfiltrating any data.

Figure 1. Workflow from raw scans of the IPv4 address space to valid certificates (and corresponding CRLs) from the Alexa Top-1M domains.
The Rapid7 data after February 5, 2014 did not include the intermediate (CA) certificates, necessitating additional steps and data to perform

However, determining if a site was vulnerable in the
past (but has since updated their OpenSSL code) is more
challenging. We observe that only three of the common TLS
implementations have ever supported SSL Heartbeats:
OpenSSL, GnuTLS, and Botan. Thus, if a host supports
the SSL Heartbeat extension, we know that it is running one 
of these three implementations. Botan is targeted for client- 
side TLS, and we know of no popular web server that uses the 
Botan TLS library. GnuTLS has support for the SSL Heartbeat 
extension, but it is not enabled by default. Furthermore, 
GnuTLS supports the Max Fragment Length SSL extension,
which is enabled by default, while OpenSSL has never sup- 
ported this extension. Thus, if we observe a host that sup- 
ports SSL Heartbeat but not the Max Fragment Length, we 
declare that host to have been running a vulnerable version 
of OpenSSL. 
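
The inference rule above reduces to a two-condition test on the set of extensions a host advertises. The sketch below assumes the extensions have already been collected as a set of names; the function name and the string constants are hypothetical, and no real scanning is performed.

```python
from typing import Set

HEARTBEAT = "heartbeat"                      # assumed label for the Heartbeat extension
MAX_FRAGMENT_LENGTH = "max_fragment_length"  # assumed label for Max Fragment Length


def likely_ran_vulnerable_openssl(extensions: Set[str]) -> bool:
    """Apply the heuristic: Heartbeat supported, Max Fragment Length not."""
    return HEARTBEAT in extensions and MAX_FRAGMENT_LENGTH not in extensions


print(likely_ran_vulnerable_openssl({HEARTBEAT}))                       # OpenSSL-like host
print(likely_ran_vulnerable_openssl({HEARTBEAT, MAX_FRAGMENT_LENGTH}))  # GnuTLS-like host
```

The heuristic relies on the observations stated above (Botan is not used by popular web servers, and GnuTLS enables Max Fragment Length by default), so it is an inference rather than a direct test of the bug.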

To collect the list of sites that were ever vulnerable to 
Heartbleed, we extracted the IP addresses in the April 28, 
2014 Rapid7 scan that were advertising a certificate with 
a CN in the Alexa Top-1M list. We found 5,951,763 unique 
IP addresses in this set. We then connected to these IP 
addresses on port 443, determined the SSL extensions that 
the host supported, and checked whether the host was still 
vulnerable to the Heartbleed vulnerability. 


4. ANALYSIS 
We now examine the collected SSL certificate data, begin- 
ning with a few definitions. 


4.1. Definitions 

We are concerned with the evolution of SSL certificates (i.e., 
when are new certificates created, old ones retired, etc.). To 
aid in understanding this evolution, we define the following 
notions: 

Certificate birth: We define the birth of an SSL certificate 
to be the date of the first scan where we observed any host 
advertising that certificate. 

Certificate death: Defining the death of a certificate is 
more complicated, as we observe a number of instances 
where many hosts advertise a given certificate, and then 
all but a few of the hosts switch over to a new certificate 
(presumably, the site intended to retire the old certifi- 
cate, but failed to update some of the hosts). To handle 
these cases, we calculate the maximum number of hosts 
that were ever advertising each certificate. We then define 
the death of an SSL certificate to be the last date that the 
number of hosts advertising the certificate was above 10% 
of that certificate’s maximum. The 10% threshold pre- 
vents us from incorrectly classifying certificates that are 
still widely available as dead, even if the certificate has 
been reissued. 

An example of certificate lifetime for m.scotrail.co.uk is 
shown in Figure 2. All hosts except one switch to a new certif- 
icate after February 10, 2014; this lone host finally switches 
on April 28, 2014. In this case, we would consider the death 
date of the old certificate to be February 10, 2014. 
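
The birth and death definitions can be sketched as follows. The input format (a mapping from scan date to the number of hosts advertising a certificate) and the sample counts are illustrative assumptions; only the 10% threshold and the February 10 and April 28 dates come from the text.

```python
from datetime import date
from typing import Dict


def certificate_birth(hosts_per_scan: Dict[date, int]) -> date:
    """Date of the first scan in which any host advertised the certificate."""
    return min(d for d, n in hosts_per_scan.items() if n > 0)


def certificate_death(hosts_per_scan: Dict[date, int],
                      threshold: float = 0.10) -> date:
    """Last scan date on which the host count exceeded threshold * maximum."""
    peak = max(hosts_per_scan.values())
    return max(d for d, n in hosts_per_scan.items() if n > threshold * peak)


# Hypothetical host counts: all but one host switch after February 10, 2014.
SCANS = {
    date(2014, 1, 27): 20,
    date(2014, 2, 3): 20,
    date(2014, 2, 10): 20,
    date(2014, 2, 17): 1,
    date(2014, 4, 21): 1,
    date(2014, 4, 28): 0,
}

print(certificate_birth(SCANS))
print(certificate_death(SCANS))
```

With these counts the lone straggler (1 host out of a maximum of 20) falls below the 10% threshold, so the death date is February 10 rather than the date the last host switched.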

Based on these definitions, we can now define the notion 
of a certificate reissue and revocation: 

Certificate reissue: We consider a certificate to be reissued
if the following three conditions hold: (a) we observe the
certificate die, (b) we observe a new certificate for the same
Common Name born within 10 days of the certificate’s death,
and (c) we observe at least one IP address switch from the old
certificate to the new. We define the date of the certificate
reissue to be the date of the certificate’s death. For the sake
of clarity, we refer to the old certificate that was replaced as
the retired certificate.
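
A minimal sketch of the reissue test follows. The Cert record and is_reissue function are hypothetical names. Two points are our assumptions rather than statements from the text: "within 10 days" is read as an absolute difference in either direction, and an IP address "switching" is approximated by an address that appears under both certificates.

```python
from datetime import date
from typing import FrozenSet, NamedTuple, Optional


class Cert(NamedTuple):
    common_name: str
    birth: date
    death: Optional[date]   # None if the certificate was never seen to die
    ips: FrozenSet[str]     # IP addresses observed advertising it


def is_reissue(old: Cert, new: Cert, max_gap_days: int = 10) -> bool:
    """Return True if `new` counts as a reissue of `old`."""
    if old.death is None:                       # (a) the old certificate died
        return False
    same_name = old.common_name == new.common_name
    # (b) the new certificate is born within max_gap_days of the death.
    born_close = abs((new.birth - old.death).days) <= max_gap_days
    # (c) at least one IP address served both certificates.
    ip_switched = bool(old.ips & new.ips)
    return same_name and born_close and ip_switched


old = Cert("example.com", date(2013, 11, 4), date(2014, 4, 14),
           frozenset({"192.0.2.1", "192.0.2.2"}))
new = Cert("example.com", date(2014, 4, 14), None, frozenset({"192.0.2.1"}))
print(is_reissue(old, new))
```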

Certificate revocation: A certificate is revoked if its serial 
number appears in any of the certificate’s CRLs. The date of 
revocation is provided in the CRL entry. 
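
The revocation lookup amounts to searching each of a certificate's CRLs for its serial number. The sketch below models a CRL as a mapping from serial number to revocation date, which is a simplification of the real format; returning the earliest date when several CRLs list the serial is our assumption.

```python
from datetime import date
from typing import Dict, Iterable, Optional


def revocation_date(serial: int,
                    crls: Iterable[Dict[int, date]]) -> Optional[date]:
    """Earliest revocation date of `serial` across the given CRLs, or None."""
    dates = [crl[serial] for crl in crls if serial in crl]
    return min(dates) if dates else None


# Hypothetical CRL contents for illustration.
crl_a = {0x1A2B: date(2014, 4, 16)}
crl_b = {0x1A2B: date(2014, 4, 17), 0x3C4D: date(2014, 3, 1)}

print(revocation_date(0x1A2B, [crl_a, crl_b]))
print(revocation_date(0x9999, [crl_a, crl_b]))
```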

In Figure 3, we present the number of certificate births, 
deaths, reissues, and revocations per day over time (please 
note the y-axis is in log scale). Births almost always outnumber 
deaths, meaning that the total number of certificates in the
wild is growing. Furthermore, we see an average of 29 certificate
revocations per day before Heartbleed; after Heartbleed,
this jumps to an average of 1,414 revocations per day. 


4.2. Server patching 

We present a brief analysis of the number of certificates 
hosted by machines that were ever vulnerable to Heartbleed. 
Of the 428,552 leaf certificates that were still alive on the last 
scan, we observe 122,832 (28.6%) of them advertised by a 
host that was likely vulnerable to Heartbleed at some point 
in time. These certificates come from 70,875 unique Alexa 
Top-1M domains. Of these certificates, 11,915 (from 10,366 
unique domains) were on hosts that were still vulnerable 
at the time of our crawl (April 30, 2014). This result dem- 
onstrates that even in the wake of a well-publicized, severe 
security vulnerability, around 10% of vulnerable sites did not 
address the issue three weeks after the fact. 


Figure 2. Example of lifetime, for certificates for m.scotrail.co.uk. All 
hosts except one switch to a new certificate after February 10, 2014. 

In Figure 4, we present the fraction of domains that have
at least one SSL host that was ever vulnerable to Heartbleed 
(or still was as of April 30, 2014). We can observe a slight 
increase in likelihood of ever being vulnerable for the most 
popular sites, but the distribution quickly stabilizes. The 
increased likelihood of being vulnerable is likely because 
these sites have larger numbers of hosts. This trend is mir- 
rored in the hosts that are still vulnerable on April 30, 2014. 


4.3. Certificate reissues 

We now examine the reissuing of SSL certificates in the 
wake of Heartbleed. Not all SSL certificate reissues that we 
observe following Heartbleed’s announcement are due to the 
Heartbleed vulnerability. In particular, reissues can happen 
for at least two other reasons: first, the old certificate could 
be expiring soon. For example, before Heartbleed, we observe 
that 50% of certificates are reissued within 60 days of their 
expiry date. Second, a site may periodically reissue certifi- 
cates as a matter of policy (regardless of expiration date). 
For example, we observed that Google typically reissues the 
google.com certificate every two weeks. 

In this study, we wish to distinguish a Heartbleed-induced 
certificate reissue from a reissue that would have happened 
anyway. We define a certificate reissue to be Heartbleed- 
induced if all three of the following conditions hold: 


1. The date of reissue was on or after April 7, 2014 (the day 
Heartbleed was announced). 

2. The certificate that is reissued was not going to expire 
for at least 60 days. This eliminates certificates that 
were likely to be reissued in the near future anyway. 

3. We do not observe more than two other reissues for 
certificates with that CN in the time before Heartbleed. 
This implies that certificates with that name do not typi- 
cally get reissued more than once every three months. 
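
The three conditions can be expressed as a single predicate. In this sketch the function and parameter names are hypothetical, and measuring the 60 days of remaining validity from the reissue date (rather than from April 7) is our assumption.

```python
from datetime import date

HEARTBLEED_ANNOUNCED = date(2014, 4, 7)


def is_heartbleed_induced(reissue_date: date,
                          old_expiry: date,
                          prior_reissues_for_cn: int) -> bool:
    """Apply conditions 1-3 to a single certificate reissue."""
    on_or_after = reissue_date >= HEARTBLEED_ANNOUNCED       # condition 1
    not_expiring = (old_expiry - reissue_date).days >= 60    # condition 2
    infrequent = prior_reissues_for_cn <= 2                  # condition 3
    return on_or_after and not_expiring and infrequent


# A reissue shortly after the announcement, far from expiry, for a rarely reissued name.
print(is_heartbleed_induced(date(2014, 4, 10), date(2015, 1, 1), 0))
# A name with 12 earlier reissues, as observed for google.com, is excluded.
print(is_heartbleed_induced(date(2014, 4, 10), date(2015, 1, 1), 12))
```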


Thus, for the examples discussed so far, we do not con- 
sider the reissue of the retired certificate in Figure 2 to be 
Heartbleed-induced (as it happened before Heartbleed), 
and we do not consider any of the google.com reissues to be 
Heartbleed-induced (because we observed a total of 12 reis- 
sues of that certificate prior to Heartbleed). 

Figure 4. Fraction of domains that have at least one host that was
ever vulnerable to Heartbleed as a function of Alexa rank, as well as
domains that continued to be vulnerable at the end of the study.

Heartbleed-induced reissues. Overall, we observe 36,781
certificate reissues that we declare to be Heartbleed-induced
in the three weeks following the announcement; this is 8.9%
of all certificates that were alive at the time Heartbleed was 
announced. Figure 5 examines the fraction of sites that 
have at least one Heartbleed-induced certificate reissue, as 
a function of Alexa rank; we can observe a strong correlation 
with Alexa rank. Higher-ranked sites are much more likely to 
have reissued at least one certificate due to Heartbleed (even 
though they are only slightly more likely to have been vul- 
nerable, as observed in Figure 4). This result complements 
previous studies’ findings that more popular websites often 
exhibit more sound administrative practices.

Reissues with same key. System administrators who be- 
lieve that their SSL private key may have been compromised 
should generate a new public/private key pair when reissu- 
ing their certificate. We now examine how frequently this is 
done, both in the case of normal certificate reissues and for 
Heartbleed-induced reissues. 

Before Heartbleed, we observe that reissuing a certifi- 
cate using the same key pair is quite common; up to 53% of 
all reissued certificates do so. This high level of key reuse
is at least partially due to system administrators re-using 
the same Certificate Signing Request (CSR) when request- 
ing the new certificate from their CA. After Heartbleed, we 
observe a significant drop in the frequency of reissuing cer- 
tificates with the same key, that is, sites are generating a 
new key pair more frequently. However, if we focus on the 
Heartbleed-induced reissues, we observe that a non-trivial 
fraction (4.1%) of these certificates are reissued with the 
same key (thereby defeating the purpose of reissuing the 
certificate). In fact, we observe a total of 912 such certifi- 
cates coming from 747 distinct Alexa domains; these cer- 
tificates may represent cases where administrators believe 
they have correctly responded to Heartbleed, but their cer- 
tificates remain as vulnerable as if they had not reissued 
at all. 

Vulnerable certificates. Finally, we examine the certifi- 
cates that should have been reissued (regardless of whether 
they actually were); we refer to these certificates as vulner- 
able certificates. We declare a certificate to be vulnerable if 
the following three conditions hold: 


1. Its date of birth was before April 7, 2014. 

2. It has not expired as of April 30. 

3. It was advertised by at least one host that was (or is) 
vulnerable to Heartbleed. 
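
These conditions can likewise be sketched as a predicate. The names are hypothetical, and treating a certificate whose expiry falls exactly on April 30 as already expired is our assumption.

```python
from datetime import date

HEARTBLEED_ANNOUNCED = date(2014, 4, 7)
STUDY_END = date(2014, 4, 30)


def is_vulnerable_certificate(birth: date,
                              expiry: date,
                              seen_on_vulnerable_host: bool) -> bool:
    """Apply conditions 1-3 for a certificate that should have been reissued."""
    born_before = birth < HEARTBLEED_ANNOUNCED   # condition 1
    still_valid = expiry > STUDY_END             # condition 2
    return born_before and still_valid and seen_on_vulnerable_host  # condition 3


print(is_vulnerable_certificate(date(2013, 12, 1), date(2015, 12, 1), True))
print(is_vulnerable_certificate(date(2014, 4, 9), date(2015, 4, 9), True))
```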


Figure 5. Fraction of domains that have at least one Heartbleed- 
induced reissue/revocation as a function of Alexa rank. 

In other words, these certificates are vulnerable because
their private keys could have been stolen by attackers. 
Overall, we find 107,712 vulnerable certificates. Of these, 
only 28,652 (26.7%) have been reissued as of April 30. The 
remaining 79,060 (73.3%) vulnerable certificates that have 
not been reissued come from 55,086 different Alexa Top-1M 
domains. Thus, the vast majority of SSL certificates that 
were potentially exposed by the Heartbleed bug remain in
use more than three weeks after the vulnerability was announced.


4.4. Certificate revocation
We now turn to investigating certificate revocation before, 
during, and after the revelation of Heartbleed. Recall that it 
is critical that a vulnerable certificate be revoked: even if a 
site reissues a new certificate, if an attacker gained access 
to the vulnerable certificate’s private key, then that attacker 
will be able to impersonate the owner until either the cer- 
tificate expires or is revoked. We study both revocation and 
expiration here, and correlate them with rates of reissue. 
Overall revocation rates. Figure 3 shows the number of 
certificate revocations over time. As noted above, the average 
jumps from 29 revocations per day to 1,414 post-Heartbleed. 
However, the spike on April 16, 2014 is somewhat mislead- 
ing, as it was largely due to the mass-revocation of 19,384 
CloudFlare certificates.

To mitigate this issue, we plot in Figure 6 the number 
of unique domains that revoked at least one certificate over 
time. We make three interesting observations: First, the mag- 
nitude of the Heartbleed-induced spike is greatly reduced, 
but we still observe an up-to-40-fold increase in the number 
of domains issuing revocations per day. Second, we observe 
that the number of domains issuing revocations falls closer 
to its pre-Heartbleed level by April 28, suggesting that within 
3 weeks most of the domains that will revoke their certificate in 
direct response to Heartbleed already have. 

Third, we observe three “dips” in the post-Heartbleed 
revocation rate on April 13, April 20, and April 27—all 
weekends, indicating that far fewer revocations occur on 
the weekend relative to the rest of the week. This periodic- 
ity can also be (less-easily) observed in the pre-Heartbleed 
time frame. It is reasonable to assume revocations dip on
weekends because humans are involved in the revocation
process; however, it is not clear who is responsible for the
delays: is it site administrators or CRL maintainers at CAs
(or both) who are not working on weekends? Regardless of
who is responsible, these weekend delays are problematic
for online security, since vulnerabilities (and the attackers
who exploit them) do not take weekends off.

Heartbleed-induced revocations. Similar to certificate 
reissues, not all certificate revocations after April 7, 2014 
are necessarily due to Heartbleed (e.g., the site could have 
exposed their private key due to a different vulnerability). 
We therefore define a Heartbleed-induced revocation to be a 
certificate revocation where the certificate had a Heartbleed- 
induced reissue (see Section 4.3). 

Overall, we observe 14,726 Heartbleed-induced revoca- 
tions; this corresponds to 40% of all Heartbleed-induced 
reissued certificates. Thus, 60% of all certificates that were 
reissued due to Heartbleed were not revoked, implying that, 
if the certificate’s private key was actually stolen, the attacker 
still would be able to impersonate the victim without any cli- 
ents being able to detect it. 

Figure 5 presents the fraction of sites that have at least 
one Heartbleed-induced certificate revocation, as a function 
of Alexa rank. Similar to reissues, sites with high rank are 
slightly more likely to revoke. Ideally, the two lines in Figure 5 
should be coincident, that is, all sites reissuing certificates 
due to Heartbleed should also have revoked the retired certificates.
This result highlights a serious gap in security best
practices across all of the sites in the Alexa Top-1M.

Finally, we examine revocation delay, or the number
of days between when a certificate is reissued and when it is
revoked. Figure 7 presents the cumulative distribution of
the revocation delay for both Heartbleed-induced and non- 
Heartbleed-induced revocations. To make the distributions 
comparable, we only look at differences between -10 and 
10 days (recall that Heartbleed-induced reissues and revo- 
cations can only occur after April 7, 2014, limiting that 
distribution). We observe that Heartbleed-induced revoca- 
tions appear to happen slightly more quickly, though not 
to the extent one might expect, given the urgent nature of 
the vulnerability. We also observe that revocation almost 
always happens after reissue, which makes sense, since this 
preserves the availability of HTTPS websites. This result 
contradicts previous assumptions that revocations and
reissues occur simultaneously. 
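
The revocation delay and its cumulative distribution can be computed as sketched below. The function names and the sample dates are hypothetical; the restriction to delays between -10 and 10 days follows the text.

```python
from datetime import date
from typing import Iterable, List, Tuple


def revocation_delay(reissued: date, revoked: date) -> int:
    """Days from reissue to revocation; negative if revocation came first."""
    return (revoked - reissued).days


def delay_cdf(pairs: Iterable[Tuple[date, date]],
              window: int = 10) -> List[Tuple[int, float]]:
    """Empirical CDF of delays restricted to [-window, window] days."""
    delays = sorted(
        d for d in (revocation_delay(r, v) for r, v in pairs)
        if -window <= d <= window
    )
    n = len(delays)
    # For each distinct delay, the fraction of kept delays that are <= it.
    return [(d, sum(1 for x in delays if x <= d) / n)
            for d in sorted(set(delays))]


# Hypothetical (reissue date, revocation date) pairs.
PAIRS = [
    (date(2014, 4, 9), date(2014, 4, 9)),    # same day
    (date(2014, 4, 9), date(2014, 4, 11)),   # revoked two days after reissue
    (date(2014, 4, 10), date(2014, 4, 8)),   # revoked two days before reissue
    (date(2014, 4, 9), date(2014, 4, 30)),   # outside the 10-day window
]
print(delay_cdf(PAIRS))
```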

Figure 7. Heartbleed-induced revocations were issued slightly faster

Expirations are not enough. To demonstrate how long the
effects of the Heartbleed vulnerability will be felt if sites do
not revoke their vulnerable certificates, we analyze vulnerable
certificates that, by the end of our data collection, were
reissued but never revoked. Although we find that 60% of the
certificates expire within a year, there are vulnerable certifi- 
cates that are valid for up to 5 years after Heartbleed was an- 
nounced. In fact, 10% of the vulnerable certificates still had 
over 3 years of validity remaining. We conclude from this 
that, given the meager rates of revocation, it would be helpful 
for CAs to shift to shorter expiry times in their certificates. 
Reissues and revocation speed. Next, we examine how
quickly sites responded to Heartbleed. Figure 8 shows the fraction
of vulnerable certificates that were not reissued or revoked over
the three weeks following the Heartbleed announcement.
In this figure, the initial y values do not all start at 1.0 for reissues:
this is because, with the coarse granularity of our data, we
know the range of time during which some certificates were
reissued, but not the precise day. We therefore provide the
most optimistic possibility: if we know a certificate was reissued
between days d and d + k, we assume it was reissued on day d.
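
The optimistic assumption can be illustrated with a small sketch that builds the "not yet reissued" curve from reissue windows. The function name, the (d, k) input format, and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def fraction_not_reissued(windows: Sequence[Tuple[int, int]],
                          total: int,
                          days: int) -> List[float]:
    """Fraction of `total` certificates not yet reissued on days 0..days.

    Each (d, k) window means the reissue happened between day d and day
    d + k after the announcement; optimistically, it is counted on day d.
    """
    curve = []
    for day in range(days + 1):
        done = sum(1 for d, _k in windows if d <= day)
        curve.append((total - done) / total)
    return curve


# Hypothetical data: 10 vulnerable certificates, 3 of which were reissued.
WINDOWS = [(0, 5), (3, 7), (10, 6)]
curve = fraction_not_reissued(WINDOWS, total=10, days=10)
print(curve[0], curve[3], curve[10])
```

Because the first window starts on day 0, the curve begins below 1.0, which is the effect described above for the reissue line in Figure 8.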

This figure presents a bleak view of how thoroughly 
sites revoke and reissue their certificates (note that the 
y-axis begins at 0.60). Three weeks after the revelation of 
Heartbleed, over 87% of all certificates we found to be vul- 
nerable were not revoked, and over 73% of them were not 
reissued. We also found that the revocation rate follows a 
pattern previously observed in earlier studies on the spread 
of patches: there is an exponential drop-off, followed by
a gradual decline. This behavior is even more pronounced 
when looking farther beyond the Heartbleed announce- 
ment: 16 weeks after the announcement, there were still 
86% who had not revoked and 70% who had not reissued. 

Extended validation certificates. Recall that one of the ma- 
jor roles of a CA is to validate the identity of the subjects who 
purchase certificates. Extended Validation (EV) certificates 
are a means by which CAs can express that this identity- 
verification process has followed (presumably) more strin- 
gent criteria. Many browsers present EV certificates differently 
in the address bar. 

EV certificates are standard X.509 certificates that are not, 
in and of themselves, more secure, but the rationale is that 
with a more thorough verification process by the CAs, these 
certificates deserve greater trust. That said, there remains 
concern as to whether this trust is well-placed. We close by 
investigating the rate at which vulnerable EV certificates were 
revoked and reissued as compared to all certificates overall.

Figure 8. Many vulnerable certificates were not revoked and reissued
after Heartbleed (note that the y-axis does not begin at zero).

Overall, Figure 8 shows EV certificates follow similar
trends to the entire corpus, with a slightly faster and more
thorough response. Interestingly, while EV certificates were
revoked more quickly, their non-EV counterparts caught 
up within 10 days; however, EV certificates were reissued 
both more quickly and more thoroughly. We expect that 
the underlying cause of this observation is a self-selection 
effect, that is, security-conscious sites are more likely to 
seek out EV certificates in the first place. Nonetheless, there 
are still many vulnerable EV certificates that have not been 
reissued three weeks after the event (67%) and that have not 
been revoked three weeks after (87%). 


5. CONCLUDING DISCUSSION 
In this paper, we studied how SSL certificates are reissued 
and revoked in response to a widespread vulnerability, 
Heartbleed, that enabled undetectable key compromise. We 
conducted large-scale measurements and developed new 
methodologies to determine how the most popular one mil- 
lion domains reacted to this vulnerability in terms of certifi- 
cate management, and how this impacts security for clients. 
We found that the vast majority of vulnerable certificates 
have not been reissued. Further, of those domains that reis- 
sued certificates in response to Heartbleed, 60% did not 
revoke their vulnerable certificates—if they do not eventu- 
ally become revoked, 20% of those certificates will remain 
valid for two or more years. The ramifications of these find- 
ings are alarming: Web browsers will remain potentially 
vulnerable to malicious third parties using stolen keys for 
a long time to come. Additionally, we found that domains 
with EV certificates performed only marginally better than 
other domains with respect to reissuing and revocation. 
Our results are, in some ways, in line with previous studies 
on the rates at which administrators patched vulnerable 
software: for instance, revocation rates followed a 
sharp exponential drop-off shortly after the vulnerability 
was made public, and tapered off soon thereafter. However, 
unlike software bugs, we find that the vast majority of cer- 
tificates remain vulnerable to attacks, as they have still not 
been reissued or revoked. These findings indicate that the 
current practices of certificate management are misaligned 
with what is necessary to secure the PKI. 


5.1. Surveying system administrators 

To help better understand the reasons behind the lack of 
prompt certificate reissues and revocations, we informally 
surveyed a few systems administrators. We asked what steps 
they had taken in response to Heartbleed: did they patch, 
reissue, and revoke, and if not, then why not? We received 
seven responses. Most reported manually patching their 
systems, but some relied on managed servers or automatic 
updates and therefore took no Heartbleed-specific steps. 
There was some variance in when patches were applied, 
due to a combination of scheduled reboots and delayed 
responses from vendors, but the majority of patches were 
applied quickly. 

For revoking and reissuing, however, we saw a wide spec- 
trum of behavior. The few who reissued and revoked did 
so within 48 hours. Many neither revoked nor reissued; a 
common reason provided was that the vulnerable hosts 
were not hosting sensitive data or services. Along similar 
lines, others reported having reissued the certificate but not 
revoked it, explaining that the certificate was only for internal 
use. Finally, others reported that they did not perceive reis- 
suing and revoking as important because they had patched 
quickly after the bug was publicly announced (recall, how- 
ever, that the vulnerability was introduced over 2 years prior). 

Our results from this small survey should be viewed 
anecdotally (more extensive surveys on certificate administration 
would be an important area of future work), 
but they do shed light on the root causes of why revoking 
and reissuing are not on equal footing with patching. 
While administrators almost universally understand the 
importance of patching after a vulnerability, many do not 
appreciate or know about the importance of revoking and 
reissuing certificates with new keys. Of those administra- 
tors who do understand the importance, some reported 
push-back from others who perceived the process as being 
overly complex. In sum, this points to the need for broader 
education on the treatment of certificates, and perhaps 
more assistance from CAs to help ensure that all the pre- 
scribed steps are taken. 


5.2. Lessons learned 

Our results suggest several changes to common PKI prac- 
tices that may improve security. First, low revocation rates 
and long expiration dates form a dangerous combination. 
Techniques that automate revocation would vastly reduce 
the period during which clients are vulnerable to malicious 
third parties. Similarly, adopting short certificate expiration 
dates (as suggested by Topalovic et al.) by default will 
significantly reduce the validity period of vulnerable certifi- 
cates. Second, mechanisms that enable simultaneous reis- 
sue-and-revoke for certificates will make it less likely that 
invalid certificates are accepted by clients. Third, we have 
found that many domains continue to serve old, vulnerable 
certificates even after they reissue. Given the large number 
of certificates and hosts using them per domain in our data- 
set, we believe administrators would benefit from tools that 
more easily track and validate the set of certificates they are 
using. 
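
As one illustration of such a tracking tool, the following Python sketch flags certificates that were issued before Heartbleed's public disclosure (April 7, 2014) and that are still unexpired and unrevoked. The `Cert` record, its field names, and the sample inventory are hypothetical and are not part of the study's own tooling; the sketch also assumes that any certificate issued after disclosure uses a fresh key.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Public disclosure date of Heartbleed. Keys behind certificates issued
# before this date on a vulnerable host may have been exposed.
HEARTBLEED_DISCLOSURE = date(2014, 4, 7)


@dataclass
class Cert:
    """Hypothetical inventory record for one certificate."""
    serial: str
    not_before: date            # start of validity period
    not_after: date             # expiration date
    revoked_on: Optional[date]  # None if never revoked


def still_exposed(cert: Cert, today: date) -> bool:
    """True if a pre-disclosure certificate is still accepted by clients."""
    issued_before = cert.not_before < HEARTBLEED_DISCLOSURE
    unexpired = today <= cert.not_after
    unrevoked = cert.revoked_on is None or cert.revoked_on > today
    return issued_before and unexpired and unrevoked


def exposure_days(cert: Cert) -> int:
    """Days after disclosure during which the certificate stays valid."""
    if cert.not_before >= HEARTBLEED_DISCLOSURE:
        return 0  # assumption: post-disclosure certificates use new keys
    end = cert.not_after
    if cert.revoked_on is not None:
        end = min(end, cert.revoked_on)
    return max(0, (end - HEARTBLEED_DISCLOSURE).days)


# Made-up inventory: never revoked, revoked promptly, issued afterwards.
inventory: List[Cert] = [
    Cert("01", date(2013, 6, 1), date(2016, 6, 1), None),
    Cert("02", date(2013, 6, 1), date(2016, 6, 1), date(2014, 4, 9)),
    Cert("03", date(2014, 4, 10), date(2015, 4, 10), None),
]

today = date(2014, 4, 28)  # three weeks after disclosure
for c in inventory:
    print(c.serial, still_exposed(c, today), exposure_days(c))
```

In this made-up inventory, only the first certificate is still exposed three weeks after disclosure, and without revocation it stays valid for more than two years; the promptly revoked one is exposed for two days.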


5.3. Future work 

This paper is, we believe, the first step towards understand- 
ing the manual process of reissuing and revoking certifi- 
cates in the wake of a vulnerability. Several interesting open 
problems remain. Because our data focuses on the server 
and CA side of the PKI ecosystem, we are unable to draw 
any direct conclusions as to what clients experience. A host- 
centered measurement study would allow us to understand 
not only when revocations were added to CRLs, but when 
clients actually received the CRLs. Moreover, our study 
opens many questions as to why the certificate reissue and 
revocation processes are so extensively mismanaged. Our 
results reinforce previous findings that site popularity is 
correlated with good security practices, but even the high- 
est ranked Alexa websites show relatively anemic rates of 
reissues and revocations. Understanding the root causes 
for why the PKI is mismanaged is an important step toward 
developing a secure infrastructure. 


5.4. Open source 

Our analysis relied on both existing, public sources of data 
and those we collected ourselves. We make all of our data 
and our analysis code available to the research community 
at https://securepki.org. 


CAREERS 


Harvard John A. Paulson School of 
Engineering and Applied Sciences 
Tenured Professor in Computer Science 


The Harvard John A. Paulson School of Engineer- 
ing and Applied Sciences (SEAS) seeks applicants 
for a position at the tenured level in the area of 
Artificial Intelligence with Societal Impact, with 
an expected start date of July 1, 2018. 

We seek a computer scientist whose research 
accomplishments include fundamental advanc- 
es in AI and impact through applications that 
improve societal well-being. We seek candidates 
who have a strong research record and a commit- 
ment to undergraduate and graduate teaching 
and training. We particularly encourage applica- 
tions from historically underrepresented groups, 
including women and minorities. 

Computer Science at Harvard is enjoying a 
period of substantial growth in numbers of stu- 
dents and faculty hiring, and in expanded facili- 
ties. We benefit from outstanding undergraduate 
and graduate students, world-leading faculty, an 
excellent location, significant industrial collabo- 
ration, and substantial support from the Harvard 
Paulson School. For more information, see http:// 
www.seas.harvard.edu/computer-science. 

The associated Center for Research on Computation 
and Society (http://crcs.seas.harvard.edu/), 
Berkman Klein Center for Internet & Society 
(http://cyber.harvard.edu), Data Science Initiative 
(https://datascience.harvard.edu/), and Institute 
for Applied Computational Science 
(http://iacs.seas.harvard.edu) foster connections among 
computer science and other disciplines throughout 
the university. 

Candidates are required to have a doctoral de- 
gree in computer science or a related area. 

Required application documents include a 
cover letter, CV, a statement of research interests, 
a teaching statement, and up to three represen- 
tative papers. Candidates are also required to 
submit the names and contact information for 
at least three references. Applicants can apply 
online at https://academicpositions.harvard.edu/postings/8037. 

We are an equal opportunity employer and 
all qualified applicants will receive consideration 
for employment without regard to race, color, re- 
ligion, sex, sexual orientation, gender identity, 
national origin, disability status, protected veteran 
status, or any other characteristic protected by law. 


IU School of Informatics and 
Computing, IUPUI 
Associate Dean for Research 


The Indiana University School of Informatics and 
Computing (SoIC), IUPUI campus, invites appli- 
cations for a tenured associate or full professor in 
the growing field of data science (or related area) 
to fill the position of Associate Dean for Research. 
The appointment will begin August 1, 2018. Candidates 
must demonstrate an outstanding scholarly 
record of research, exhibited by high-impact 
peer-reviewed publications, a forward-looking, 
vigorous research agenda and a demonstrated 
history of securing significant, competitive exter- 
nal funding. 

An exceptional researcher is sought to lead 
and expand the research enterprise of our school 
and contribute to the department’s growing data 
science academic program. All areas of data sci- 
ence will be considered including data mining, 
statistical machine learning, descriptive, predic- 
tive, and prescriptive analytics, cloud computing, 
distributed databases, high performance com- 
puting, data visualization, or other areas involv- 
ing the collection, organization, management, 
and extraction of knowledge from massive, com- 
plex, heterogeneous datasets. Data may include 
text, images, video, sensor and instrument data, 
clickstream data, social media interactions, neu- 
roimaging data, genomics, proteomics, or me- 
tabolomics data, etc. 

The overarching responsibility is to expand 
the research portfolio of the SoIC. Full details of 
this position can be found at 
https://indiana.peopleadmin.com/postings/5287. 

Questions pertaining to this position can be 
directed to Jeff Hostetler, Assistant to the Dean, at 
jehostet@iupui.edu. 

The School of Informatics and Computing is 
eager to consider applications from women and mi- 
norities. Indiana University is an Affirmative Action/ 
Equal Opportunity Employer. IUPUI is an Affirma- 
tive Action/Equal Opportunity Institution M/F/D/V. 


National University of Singapore (NUS) 
Sung Kah Kay Assistant Professorship 


The Department of Computer Science at the Na- 
tional University of Singapore (NUS) invites appli- 
cations for the Sung Kah Kay Assistant Professor- 
ship. Applicants can be in any area of computer 
science. This prestigious chair was set up in mem- 
ory of the late Assistant Professor Sung Kah Kay af- 
ter his untimely demise early in his career at NUS. 
Candidates should be early in their academic ca- 
reers and yet demonstrate outstanding research 
potential, and strong commitment to teaching. 

The Department enjoys ample research fund- 
ing, moderate teaching loads, excellent facilities, 
and extensive international collaborations. We have 
a full range of faculty covering all major research areas 
in computer science and boast a thriving PhD 
program that attracts the brightest students from 
the region and beyond. More information is avail- 
able at www.comp.nus.edu.sg/careers. 

NUS is an equal opportunity employer that 
offers highly competitive salaries, and is situated 
in Singapore, an English-speaking cosmopolitan 
city that is a melting pot of many cultures, from both 
the East and the West. Singapore offers high-quality 
education and healthcare at all levels, as well 
as very low tax rates. 


Application Details: 
- Submit the following documents (in a single 
  PDF) online via: https://faces.comp.nus.edu.sg 
  - A cover letter that indicates the position 
    applied for and the main research interests 
  - Curriculum Vitae 
  - A teaching statement 
  - A research statement 
- Provide the contact information of 3 referees 
  when submitting your online application, or 
  arrange for at least 3 references to be sent directly 
  to csrec@comp.nus.edu.sg 
- Application reviews will commence immediately 
  and continue until the position is filled 

If you have further enquiries, please contact 
the Search Committee Chair, Weng-Fai Wong, at 
csrec@comp.nus.edu.sg. 


Southern University of Science and 
Technology (SUSTech) 

Professor Position in Computer Science and 
Engineering 


The Department of Computer Science and Engi- 
neering (CSE, http://cse.sustc.edu.cn/en/), South- 
ern University of Science and Technology (SUS- 
Tech) has multiple Tenure-track faculty openings 
at all ranks, including Professor/Associate Profes- 
sor/Assistant Professor. We are looking for out- 
standing candidates with demonstrated research 
achievements and keen interest in teaching. 

Applicants should have an earned Ph.D. de- 
gree and demonstrated achievements in both 
research and teaching. Teaching at SUSTech 
is bilingual, in either English or Putonghua. 
It is perfectly acceptable to use English in all 
lectures, assignments, and exams. In fact, our 
existing faculty members include several 
non-Chinese-speaking professors. 

As a State-level innovative city, Shenzhen has 
identified innovation as the key strategy for its 
development. It is home to some of China’s most 
successful high-tech companies, such as Huawei 
and Tencent. SUSTech considers entrepreneurship 
to be one of the main directions of the university. 
Strong support will be provided for possible 
new initiatives. SUSTech encourages candidates 
with experience in entrepreneurship to apply. 

The Department of Computer Science and 
Engineering at SUSTech was founded in 2016. 
It has 17 professors, all of whom hold doctoral 
degrees or have years of experience in overseas 
universities. Among them, two were elected into 
the “1000 Talents” Program in China; three are 
IEEE fellows; and one is an IET fellow. The department is 
expected to grow to 50 tenure track faculty mem- 
bers eventually, in addition to teaching-only pro- 
fessors and research-only professors. 

SUSTech is a pioneer in higher education 
reform in China. The mission of the University 
is to become a globally recognized research uni- 
versity which emphasizes academic excellence 
and promotes innovation, creativity and entrepreneurship. 
Set on five hundred acres of wooded 
landscape in the picturesque Nanshan (South 
Mountain) area, the campus offers an ideal en- 
vironment for learning and research. SUSTech 
is committed to increasing the diversity of its faculty, 
and has a range of family-friendly policies 
in place. The university offers competitive sala- 
ries and fringe benefits including medical insur- 
ance, retirement and housing subsidy, which are 
among the best in China. Salary and rank will be 
commensurate with qualifications and experience. 
More information can be found at 
http://talent.sustc.edu.cn/en. 

We provide some of the best start-up packages 
in the sector to our faculty members, including 
one PhD studentship per year, in addition to a 
significant amount of start-up funding. 

To apply, please provide a cover letter iden- 
tifying the primary area of research, curriculum 
vitae, and research and teaching statements, and 
forward them to cshire@sustc.edu.cn. 


Swarthmore College 
Computer Science Department 
Visiting Faculty Positions in Computer Science 


The Computer Science Department invites ap- 
plications for multiple visiting positions at the 
rank of Assistant Professor to begin Fall semester 
2018. 

For the visiting position, strong applicants in 
any area will be considered. Priority will be given 
to complete applications received by February 15, 
2018. 

Applications will continue to be accepted af- 
ter these dates until the positions are filled. 

The Computer Science Department currently 
has eight tenure-track faculty and four visiting 
faculty. Faculty teach introductory courses as well 
as advanced courses in their research areas. We 
have grown significantly in both faculty and stu- 
dents in the last five years. Presently, we are one 
of the most popular majors at the College and 
expect to have over 70 Computer Science majors 
graduating this year. 

QUALIFICATIONS: Applicants must have 
teaching experience and should be comfortable 
teaching a wide range of courses at the introduc- 
tory and intermediate level. Candidates should 
additionally have a strong commitment to involv- 
ing undergraduates in their research. A Ph.D. in 
Computer Science at or near the time of appoint- 
ment is required. The strongest candidates will 
be expected to demonstrate a commitment to 
creative teaching and an active research program 
that speaks to and motivates undergraduates 
from diverse backgrounds. 

APPLICATION INSTRUCTIONS: Applica- 
tions should include a cover letter, vita, teach- 
ing statement, research statement, and three 
letters of reference, at least one (preferably two) 
of which should speak to the candidate’s teach- 
ing ability. In your cover letter, please briefly de- 
scribe your current research agenda; what would 
be attractive to you about teaching diverse stu- 
dents in a liberal arts college environment; and 
what background, experience, or interests are 
likely to make you a strong teacher of Swarth- 
more College students. 

This institution is using Interfolio’s Fac- 
ulty Search to conduct this search. Applicants 
to this position receive a free Dossier account 
and can send all application materials, including 
confidential letters of recommendation, 
free of charge. To apply, visit 
https://apply.interfolio.com/45234. 

Swarthmore College actively seeks and wel- 
comes applications from candidates with ex- 
ceptional qualifications, particularly those with 
demonstrable commitments to a more inclusive 
society and world. Swarthmore College is an 
Equal Opportunity Employer. Women and mi- 
norities are encouraged to apply. 


Toyota Technological Institute 
Principal Professor 


Toyota Technological Institute has one opening 
for a “Principal Professor” position in its School 
of Engineering. For more information, please 
refer to the following website: 
http://www.toyota-ti.ac.jp/english/employment/index.html. 


Research fields: Science and technology for 
advanced instrumentation and/or information 
processing 

Examples: New devices and systems for advanced 
information processing, communication, and/ 
or sensing; Leading instrumentation technolo- 
gies for ultra-sensitive measurements and/or 
bio-medical studies and diagnosis; Science and 
technology of cyber-physical systems. 


Qualifications: A successful candidate must 
have a Ph.D. degree or the equivalent in a relevant 
field; he/she must possess outstanding 
competence to promote world-class research 
program(s) as well as to conduct excellent 
teaching and research supervision for graduate 
and undergraduate students, so as to fulfill his/ 
her mission as a superb leader in research and 
education. 


Positions: Principal Professor 

The “Principal Professor” will serve as the head 
of a “unit laboratory,” that consists of the Prin- 
cipal Professor, one associate professor, and 
three post-doctoral fellows. A start-up grant of 
about one hundred million Japanese yen (ca. 
one million US dollars) is available. In addition, 
a research budget of about ten million Japanese 
yen (ca. one hundred thousand US dollars) will be 
given each year to promote research programs for 
a period of five years. At the end of this five-year 
term, the principal professor will be given a for- 
mal evaluation. 


Number of Positions Available: One 
Start date: April 1, 2019 or on the date of the earli- 
est convenience 


Documents: 
1. A curriculum vitae 
2. A list of publications 
3. Copies of 5 representative papers 
4. An outline of research and educational 
   accomplishments (about 3 pages) 
5. A future plan of educational and research 
   activities (about 3 pages) 
6. Names of two references, including phone 
   numbers and e-mail addresses 
7. An application form (available on our 
   website) 
Deadline: May 15, 2018 

Inquiries: 
Search Committee Chair, Dr. Kazuo Hotate, Vice 
President & Professor. 
(Phone) +81-52-809-1821 (E-mail) hotate- 
koubo@toyota-ti.ac.jp 
The above documentation should be sent to: 
Mr. Masashi Hisamoto 
Toyota Technological Institute 
2-12-1, Hisakata, Tempaku-ku, Nagoya, 468- 
8511 Japan 


Please write “Application for Principal Profes- 
sor” on the envelope. 
The application documents will not be returned. 


Western Michigan University 
Assistant/Associate Professor in Computer 
Science 


Applications are invited for a tenure-track posi- 
tion at the assistant or associate professor level in 
the area of applied information security in the De- 
partment of Computer Science at Western Michi- 
gan University (Kalamazoo, MI) starting August 
2018 or January 2019. 

Applicants must have a Ph.D. in Computer 
Science or a closely related field. We are looking 
for candidates with expertise in applied informa- 
tion security to support our new M.S. in Informa- 
tion Security. The program is offered fully online 
and in cooperation with the Department of Busi- 
ness Information Systems. 

Successful candidates will be capable of es- 
tablishing an active research program leading 
to funding, supervising graduate students, and 
teaching courses at both the undergraduate and 
graduate levels in information security. Other 
duties include development of undergraduate 
and graduate courses, advising and service at 
the University, College, Department and profes- 
sional society levels. 

Application screening will start immediately 
and the position will remain open until filled. 
Successful candidates must earn their Ph.D. de- 
gree by the time of employment. 

The Department has 260 undergraduates, 
50 M.S. students and 45 Ph.D. students. Cur- 
rent active research areas include security, pri- 
vacy, networks, embedded systems/internet 
of things, compilers, computational biology, 
massive data analytics, scientific computing, 
parallel computing, formal verification, parallel 
debugging, and data mining. More information 
regarding Western Michigan University, 
the College of Engineering and Applied 
Sciences, and the Department of Computer 
Science is available at http://www.wmich.edu, 
http://www.wmich.edu/engineer, and 
http://wmich.edu/cs, respectively. 

The Carnegie Foundation for the Advance- 
ment of Teaching has placed WMU among the 
76 public institutions in the nation designated as 
research universities with high research activity. 

WMU is an Equal Opportunity/Affirmative Ac- 
tion Employer. Minorities, women, veterans, in- 
dividuals with disabilities and all other qualified 
individuals are encouraged to apply. 

To do so, please visit http://wmich.edu/hr/jobs 
and provide a cover letter, curriculum vitae, 
statement of research goals, teaching statement, 
and names and contact information of at least 
three references. 


[CONTINUED FROM P. 120] 

In one sense, it’s very practical. 
Deep learning has been successful 
not just because it works well, but 
also because it automates part of the 
process of building and designing in- 
telligent systems. In the old days, ev- 
erything was manual; you had to find 
a way to express all of human knowledge 
in a set of rules, which turns out 
to be extremely complicated. Even 
in the more traditional realm of ma- 
chine learning, part of the system was 
trained, but most of it was still done 
by hand, so for classical computer vi- 
sion systems, you had to design a way 
to pre-process the image to get it into 
a form that your learning algorithm 
could digest. 


With deep learning, on the other hand, 
you can train an entire system more or 
less from end to end. 

Yes, but you need a lot of labeled 
data to do it, which limits the number 
of applications and the power of the 
system, because it can only learn what- 
ever knowledge is present within your 
labeled datasets. The more long-term 
reason for trying to train or pre-train a 
learning system on unlabeled data is 
that, as you said, animals and humans 
build models of the world mostly by 
observation, and we’d like machines to 
do that as well, because accumulating 
massive amounts of knowledge about 
the world is the only way they will eventually 
acquire a certain level of common sense. 


What about adversarial training, in 
which a set of machines learn together 
by pursuing competing goals? 

This is an idea that popped up a few 
years ago in Yoshua Bengio’s lab with 
Ian Goodfellow, one of his students at 
the time. One important application is 
predictions. If you build a self-driving 
car or any other kind of system, you’d 
like that system to be able to predict 
what’s going to happen next—to simu- 
late the world and see what a particular 
sequence of actions will produce with- 
out actually doing it. That would allow 
it to anticipate things and act accord- 
ingly, perhaps to correct something or 
plan in advance. 


How does adversarial training address 
the problem of prediction in the pres- 
ence of uncertainty? 


When I show you a segment of a 
video and I ask what happens next, you 
might be able to predict to some ex- 
tent, but not exactly; there are proba- 
bly several different outcomes that are 
possible. So when you train a system to 
predict the future, and there are sev- 
eral possible futures, the system takes 
an average of all the possibilities, and 
that’s not a good prediction. 
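
The averaging effect can be seen in a toy case, sketched here in Python. Assume a hypothetical one-dimensional scene in which an object moves left (-1) or right (+1) equally often; a predictor chosen to minimize squared error settles on the mean, 0, which matches neither future. The outcomes, the candidate predictions, and the squared-error loss are illustrative assumptions, not details from the interview.

```python
from statistics import mean

# Hypothetical observed futures: the object goes left or right equally often.
outcomes = [-1.0, 1.0, -1.0, 1.0]


def mean_squared_error(prediction, observed):
    """Average squared distance between one prediction and all outcomes."""
    return mean((prediction - y) ** 2 for y in observed)


# Try a handful of single-valued predictions and keep the lowest-error one.
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
best = min(candidates, key=lambda p: mean_squared_error(p, outcomes))

print("best single prediction:", best)
print("is it one of the real outcomes?", best in outcomes)
```

The lowest-error prediction here is the midpoint between the two futures, a value that never actually occurs.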

Adversarial training allows us to 
train a system where there are multiple 
correct outputs by asking it to make a 
prediction, then telling it what should 
have been predicted. One of the central 
ideas behind this is that you train two 
neural networks simultaneously; there 
is one neural net that does the predic- 
tion and there’s a second neural net 
that essentially assesses whether the 
prediction of the first neural net looks 
probable or not. 


You recently helped found the Partner- 
ship on AI, which aims to develop and 
share best practices and provide a plat- 
form for public discussion. 

There are questions related to 
the deployment and perception of 
AI within the public and govern- 
ment, questions about the ethics of 
testing, reliability, and many other 
things that we thought went beyond 
a single company. 


Thanks to rapid advances in the field, 
many of these questions are coming 
up very quickly. It seems like there’s a 
lot of excitement, but also a lot of ap- 
prehension in the public about where 
AI is headed. 

Humans make decisions under 
what’s called bounded rationality. We 
are very limited in the time and effort 
we can spend on any decision. We are 
biased, and we have to use our bias 
because that makes us more efficient, 
though it also makes us less accurate. 
To reduce bias in decisions, it’s better 
to use machines. That said, you need 
to apply AI in ways that are not biased, 
and there are techniques being devel- 
oped that will allow people to make 
sure that the decisions made by AI sys- 
tems have as little bias as possible. 


Leah Hoffmann is a technology writer based in Piermont, 
NY, USA. 


The Network Effect 


The developer of convolutional neural networks 
looks at their impact, today and in the long run. 


DEEP LEARNING MIGHT be a booming 
field these days, but few people re- 
member its time in the intellectual 
wilderness better than Yann LeCun, 
director of Facebook Artificial Intelli- 
gence Research (FAIR) and a part-time 
professor at New York University. Le- 
Cun developed convolutional neural 
networks while a researcher at Bell 
Laboratories in the late 1980s. Now, 
the group he leads at Facebook is us- 
ing them to improve computer vision, 
to make predictions in the face of un- 
certainty, and even to understand nat- 
ural language. 


Your work at FAIR ranges from long- 
term theoretical research to applica- 
tions that have real product impact. 
We were founded with the idea of 
making scientific and technological 
progress, but I don’t think the Face- 
book leadership expected quick re- 
sults. In fact, many things have had 
a fairly continuous product impact. 
In the application domain, our group 
works on things like text understand- 
ing, translation, computer vision, im- 
age understanding, video understand- 
ing, and speech recognition. There are 
also more esoteric things that have had 
an impact, like large-scale embedding. 


This is the idea of associating every ob- 
ject with a vector. 

Yes. You describe every object 
on Facebook with a list of numbers, 
whether it’s a post, news item, photo, 
comment, or user. Then, you use op- 
erations between vectors to see if, say, 
two images are similar, or if a person is 
likely to be interested in a certain piece 
of content, or if two people are likely to 
be friends with one another. 
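
A minimal sketch of this idea in Python: each object is a short list of numbers, and similarity is measured with the cosine of the angle between two vectors. The three-dimensional vectors and object names below are made up for illustration; the interview does not say which operation or vector size Facebook actually uses.

```python
import math
from typing import Dict, List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings; real systems use far more dimensions.
embeddings: Dict[str, List[float]] = {
    "photo_of_cat": [0.9, 0.1, 0.0],
    "photo_of_kitten": [0.8, 0.2, 0.1],
    "news_item_on_taxes": [0.0, 0.1, 0.9],
}


def most_similar(name: str) -> str:
    """Return the other object whose vector is closest to the named one."""
    query = embeddings[name]
    others = [other for other in embeddings if other != name]
    return max(others, key=lambda o: cosine_similarity(query, embeddings[o]))


print(most_similar("photo_of_cat"))
```

With these made-up vectors, the two photos point in nearly the same direction, so the kitten photo is returned as the nearest neighbor of the cat photo.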


What are some of the things going 
on at FAIR that most interest or ex- 
cite you? 

It’s all interesting! But I’m person- 
ally interested in a few things. 

One is marrying reasoning with 
learning. A lot of learning has to do with 
perceptions, which are relatively simple 
things that people can do without think- 
ing too much. But we haven’t yet found 
good recipes for training systems to do 
tasks that require a little bit of reason- 
ing. There is some work in that direc- 
tion, but it’s not where we want it. 

Another area that interests me is 
unsupervised learning—teaching ma- 
chines to learn by observing the world, 
say by watching videos or looking at 
images without being told what objects 
are in these images. 

And the last thing would be autonomous 
AI systems whose behavior is 
not directly controlled by a person. In 
other words, they are designed not just 
to do one particular task, but to make 
decisions and adapt to different cir- 
cumstances on their own. 


How does the interplay work between 
research and product? 

There’s a group called Applied Ma- 
chine Learning, or AML, that works 
closely with FAIR and is a bit more on 
the application side of things. That 
group did not exist when I joined Face- 
book, but I pushed for its creation, be- 
cause I saw this kind of relationship 
work very well at AT&T. Then AML be- 
came a victim of its own success. There 
was so much demand within the com- 
pany for the platforms they were devel- 
oping, which basically enabled all kinds 
of groups within Facebook to use ma- 
chine learning in their products, that 
they ended up moving away from FAIR. 

Recently we reorganized this a little 
bit. A lot of the AI capability is now be- 
ing moved to the product groups, and 
AML is refocusing on the advanced 
development of things that are close 
to research. In certain areas like com- 
puter vision, there is a very, very tight 
collaboration, and things go back and 
forth really quickly. In other areas that 
are more disruptive or for which there 
is no obvious product, it’s more like, 
‘let us work on it for a few years first’. 


Let’s talk about unsupervised learning, 
which, as you point out elsewhere, is 
much closer to the way that humans ac- 
tually learn. [CONTINUED ON P. 119] 